Abstract

We prove a general duality result showing that a Brascamp-Lieb type inequality is equivalent 
to an inequality expressing subadditivity of the entropy, with a complete correspondence of best 
constants and cases of equality. This opens a new approach to the proof of Brascamp-Lieb 
type inequalities, via subadditivity of the entropy. We illustrate the utility of this approach by 
proving a general inequality expressing the subadditivity property of the entropy on $\mathbb{R}^n$, and
fully determining the cases of equality. As a consequence of the duality mentioned above, we 
obtain a simple new proof of the classical Brascamp-Lieb inequality, and also a fully explicit 
determination of all of the cases of equality. We also deduce several other consequences of 
the general subadditivity inequality, including a generalization of Hadamard's inequality for 
determinants. Finally, we also prove a second duality theorem relating superadditivity of the 
Fisher information and a sharp convolution type inequality for the fundamental eigenvalues of 
Schrödinger operators. Though we focus mainly on the case of random variables in $\mathbb{R}^n$ in this
paper, we discuss extensions to other settings as well.
1 Introduction



Let $(\Omega, \mathcal{S}, \mu)$ be a measure space, and let $f$ be a probability density on $(\Omega, \mathcal{S}, \mu)$. That is, $f$ is a nonnegative
integrable function on $\Omega$ with $\int_\Omega f \, d\mu = 1$. On the convex subset of probability densities

$$\left\{ f \;:\; \int_\Omega f \ln(1 + f) \, d\mu < \infty \right\} , \qquad (1.1)$$

the entropy of $f$, $S(f)$, is defined by

$$S(f) = \int_\Omega f(x) \ln f(x) \, d\mu(x) .$$

With this sign convention for the entropy, the inequalities we derive are of superadditive type; 
however, the terminology "subadditivity of the entropy" is too well entrenched to use anything 
else. 

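Now let $p : \Omega \to \mathbb{R}$ be measurable. Let $\nu$ be a Borel measure on $\mathbb{R}$, and define $f_{(p)}$ to be the
probability density on $(\mathbb{R}, \mathcal{B}, \nu)$ such that for all bounded continuous functions $\phi$ on $\mathbb{R}$,

$$\int_\Omega \phi(p(x)) f(x) \, d\mu(x) = \int_{\mathbb{R}} \phi(t) f_{(p)}(t) \, d\nu(t) . \qquad (1.2)$$

In other words, the measure $f_{(p)} \, d\nu$ is the "push-forward" of the measure $f \, d\mu$ under $p$:

$$p_\#(f \, d\mu) = f_{(p)} \, d\nu .$$

The entropy of $f_{(p)}$, $S(f_{(p)})$, is defined just as $S(f)$ was, except with $(\mathbb{R}, \mathcal{B}, \nu)$ replacing $(\Omega, \mathcal{S}, \mu)$. We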
shall be concerned with the following two questions: 

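(1) Given $m$ measurable functions $p_1, \dots, p_m$ on $\Omega$, and $m$ nonnegative numbers $c_1, \dots, c_m$, is there
a finite constant $D$ such that

$$\sum_{j=1}^m c_j S(f_{(p_j)}) \le S(f) + D \qquad (1.3)$$

for all probability densities $f$ with finite entropy (i.e. satisfying (1.1))?

(2) Given $m$ measurable functions $p_1, \dots, p_m$ on $\Omega$, and $m$ nonnegative numbers $c_1, \dots, c_m$, is there
a finite constant $D$ such that

$$\int_\Omega \prod_{j=1}^m f_j(p_j(x)) \, d\mu(x) \le e^D \prod_{j=1}^m \left( \int_{\mathbb{R}} f_j^{1/c_j}(t) \, d\nu(t) \right)^{c_j} \qquad (1.4)$$

for any $m$ nonnegative functions $f_1, \dots, f_m$ on $\mathbb{R}$?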

For example, consider the case that $\Omega = \mathbb{R}^n$ with its standard Euclidean structure, and $\mu$ is
Lebesgue measure on $\mathbb{R}^n$, while $\nu$ is Lebesgue measure on $\mathbb{R}$. Suppose that $a_1, \dots, a_m$ are $m$ vectors
that span $\mathbb{R}^n$, and define

$$p_j(x) = a_j \cdot x .$$

In this case, if we let $X$ denote a random vector with values in $\mathbb{R}^n$ whose law has the density $f$,
then $f_{(p_j)}$ is simply the density of the law of $a_j \cdot X$. If we define the entropy of a random variable
to be the entropy of its density, provided it has one, we can rewrite this Euclidean version of (1.3)
as

$$\sum_{j=1}^m c_j S(a_j \cdot X) \le S(X) + D .$$

In case $m = n$, $c_j = 1$ for all $j$, and $\{a_1, \dots, a_n\}$ is an orthonormal basis of $\mathbb{R}^n$, then this inequality
holds with $D = 0$, and is the classical subadditivity of the entropy inequality.
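
As a quick numerical illustration of the classical case, the following minimal Python sketch evaluates both sides of the inequality for a centered Gaussian vector in $\mathbb{R}^2$, using the closed-form Gaussian entropies under the sign convention $S(f) = \int f \ln f$. The function names and the restriction to $n = 2$ with the standard basis are assumptions made for the illustration; they are not taken from the paper.

```python
import math

def gaussian_entropy_1d(variance):
    """S = integral of f ln f for a centered Gaussian on R with the given variance."""
    return -0.5 * math.log(2.0 * math.pi * math.e * variance)

def gaussian_entropy_2d(a, b, d):
    """S = integral of f ln f for a centered Gaussian on R^2 with covariance [[a, b], [b, d]]."""
    det = a * d - b * b
    if a <= 0 or det <= 0:
        raise ValueError("covariance must be positive definite")
    return -math.log(2.0 * math.pi * math.e) - 0.5 * math.log(det)

def subadditivity_gap(a, b, d):
    """S(X) - [S(e_1 . X) + S(e_2 . X)]; the classical inequality says this is >= 0."""
    return gaussian_entropy_2d(a, b, d) - gaussian_entropy_1d(a) - gaussian_entropy_1d(d)
```

In this Gaussian family the gap equals $\frac{1}{2} \ln\big(ad / (ad - b^2)\big)$, so it vanishes exactly when $b = 0$, that is, when the two coordinates are independent.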

It is even easier to recognize (1.4) as a classical result in this setting: It becomes

$$\int_{\mathbb{R}^n} \prod_{j=1}^m f_j(a_j \cdot x) \, d^n x \le e^D \prod_{j=1}^m \left( \int_{\mathbb{R}} f_j^{1/c_j}(t) \, dt \right)^{c_j} ,$$

which is the classical Brascamp-Lieb inequality. A celebrated theorem of Brascamp and Lieb [9]
says that the best constant in this inequality can be computed by using only centered Gaussian
functions as trial functions. A new proof based on optimal mass transport was given by Barthe [2]
who also gave a characterization (depending on the vectors $a_j$ and the constants $c_j$) of when the
constant is finite together with a description of the optimizers in some situations. Carlen, Lieb
and Loss [11] introduced a new approach to the Brascamp-Lieb inequalities based on heat flow (see
also [3]). These authors also filled the gaps left by Barthe in the description of the optimizers.
Bennett, Carbery, Christ and Tao [6] used a similar approach to deal with the multidimensional
versions of the Brascamp-Lieb inequality (see also [7] for a direct approach to the finiteness of
the constant $e^D$). The paper [11] (and [6] in the multidimensional setting) develops a "splitting
procedure" that will prove useful in our situation too. But we shall see that working with entropy
clarifies many technical points.

There are a number of other examples, besides the classical one in the Euclidean setting, where
choices of $(\Omega, \mu)$ and $\nu$ lead to inequalities of interest. For a second example, take $\Omega = S^{n-1}$, the
unit sphere in $\mathbb{R}^n$, and let $\mu$ be the uniform probability measure on $S^{n-1}$. Then take $\nu$ to be the
probability measure on $\mathbb{R}$ that is the law of $u \cdot x$, where $u$ is any unit vector in $\mathbb{R}^n$, so that for all
continuous functions $\phi$,

$$\int_{S^{n-1}} \phi(u \cdot x) \, d\mu(x) = \int_{\mathbb{R}} \phi(t) \, d\nu(t) .$$

(By the rotational invariance of $\mu$, this does not depend on the choice of $u$.) Now let $\{e_1, \dots, e_n\}$
denote the standard orthonormal basis in $\mathbb{R}^n$, and define $p_j(x)$ on $S^{n-1}$ by $p_j(x) = e_j \cdot x$. Then
one has the optimal inequalities

$$\frac{1}{2} \sum_{j=1}^n S(f_{(p_j)}) \le S(f) , \qquad (1.5)$$

for any probability density $f$ on $(\Omega, \mu)$ with finite entropy, and

$$\int_{S^{n-1}} \prod_{j=1}^n f_j(e_j \cdot x) \, d\mu(x) \le \prod_{j=1}^n \left( \int_{S^{n-1}} f_j^2(e_j \cdot x) \, d\mu(x) \right)^{1/2} = \prod_{j=1}^n \left( \int_{[-1,1]} f_j^2(t) \, d\nu(t) \right)^{1/2} \qquad (1.6)$$

for any $n$ nonnegative functions $f_1, \dots, f_n$ on $[-1, 1]$. See [11] for the original proofs of (1.5) and
(1.6), in which (1.5) was deduced from (1.6). See [4] for a different and direct proof of (1.5).

Since we are concerned in this paper with the relation between subadditivity of entropy and
Brascamp-Lieb type inequalities, it is worth recalling the short argument from [11] that provided the
passage from (1.6) to (1.5): Let $f$ be any probability density on $S^{n-1}$, and let $f_{(p_1)}, f_{(p_2)}, \dots, f_{(p_n)}$
be its $n$ marginals, as above. Then define another probability density $g$ on $S^{n-1}$ by

$$g(x) := \frac{1}{C} \prod_{j=1}^n f_{(p_j)}^{1/2}(e_j \cdot x) \quad \text{where} \quad C := \int_{S^{n-1}} \prod_{j=1}^n f_{(p_j)}^{1/2}(e_j \cdot x) \, d\mu(x) .$$

Then by positivity of the relative entropy (Jensen's inequality), we have

$$0 \le \int_{S^{n-1}} \ln\left( \frac{f}{g} \right) f \, d\mu = S(f) - \int_{S^{n-1}} \ln\left( \prod_{j=1}^n f_{(p_j)}^{1/2}(e_j \cdot x) \right) f(x) \, d\mu(x) + \ln C = S(f) - \frac{1}{2} \sum_{j=1}^n S(f_{(p_j)}) + \ln C . \qquad (1.7)$$

Finally, (1.6) implies that

$$C = \int_{S^{n-1}} \prod_{j=1}^n f_{(p_j)}^{1/2}(e_j \cdot x) \, d\mu(x) \le \prod_{j=1}^n \left( \int_{S^{n-1}} f_{(p_j)}(e_j \cdot x) \, d\mu(x) \right)^{1/2} = 1$$

since each $f_{(p_j)}$ is a probability density. Thus, $\ln(C) \le 0$, so that (1.5) now follows from (1.7). This
argument may give the impression that (1.6) is a "stronger" inequality than (1.5), but as we shall
see, this is not the case.

For a third example, take $\Omega = S_n$, the symmetric group on $n$ letters. Let $\mu$ be the uniform
probability measure on $S_n$, and take $\nu$ to be the uniform probability measure on $\{1, 2, \dots, n\}$, so
$\nu(\{i\}) = 1/n$ for all $i$. Define the functions $p_j : \Omega \to \{1, 2, \dots, n\} \subset \mathbb{R}$ by $p_j(\sigma) = \sigma(j)$ for any
permutation $\sigma$ of $\{1, 2, \dots, n\}$. Then one has the optimal inequalities

$$\frac{1}{2} \sum_{j=1}^n S(f_{(p_j)}) \le S(f) , \qquad (1.8)$$

for any probability density $f$ on $(\Omega, \mu)$, and

$$\int_{S_n} \prod_{j=1}^n f_j(p_j(\sigma)) \, d\mu(\sigma) \le \prod_{j=1}^n \left( \int_{S_n} f_j^2(p_j(\sigma)) \, d\mu(\sigma) \right)^{1/2} = \prod_{j=1}^n \left( \frac{1}{n} \sum_{k=1}^n f_j^2(k) \right)^{1/2} \qquad (1.9)$$

for any $n$ nonnegative functions $f_1, \dots, f_n$ on $\{1, \dots, n\}$. See [12] for the proof of (1.9). One could
then derive (1.8) using the exact same argument that was used to derive (1.5) from (1.6).
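
Because the symmetric group is finite, inequality (1.8) can be explored by direct enumeration. The following minimal Python sketch computes the marginals $f_{(p_j)}$ with respect to the uniform measure $\nu$ and returns the difference between the two sides of (1.8). The function names are hypothetical, indices are 0-based (so the letters are $0, \dots, n-1$), and `f` is assumed to list the density values in the order produced by `itertools.permutations`, normalized to have $\mu$-mean 1.

```python
import math
from itertools import permutations

def entropy(density, weight):
    """S(g) = sum_k g(k) ln g(k) * weight, for a density g w.r.t. a uniform measure."""
    return sum(g * math.log(g) for g in density if g > 0) * weight

def marginal(f, perms, j, n):
    """Density of sigma(j) under f dmu, with respect to the uniform measure nu."""
    g = [0.0] * n
    for value, sigma in zip(f, perms):
        # mu gives each permutation mass 1/n!, nu gives each letter mass 1/n
        g[sigma[j]] += value * n / len(perms)
    return g

def entropy_gap(f, n):
    """S(f) - (1/2) * sum_j S(f_(p_j)); inequality (1.8) asserts this is >= 0."""
    perms = list(permutations(range(n)))
    s_f = entropy(f, 1.0 / len(perms))
    s_marginals = sum(entropy(marginal(f, perms, j, n), 1.0 / n) for j in range(n))
    return s_f - 0.5 * s_marginals
```

For instance, for $n = 3$ and the density concentrated on a single permutation (`f = [6, 0, 0, 0, 0, 0]`), the gap is $\ln 6 - \frac{3}{2} \ln 3 > 0$, while the uniform density gives a gap of zero.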

There are more examples of interesting specializations of (1.3) and (1.4). However, these examples
suffice to illustrate the context in which the present work is set, and we now turn to the
results. One basic result of this paper is the following:

The two questions concerning (1.3) and (1.4) that were raised above are in fact one and the
same: We shall prove here that the answer to one question is "yes" if and only if the answer to the
other question is "yes", with the same constant $D$, and with a complete correspondence of cases
of equality.



Thus, if one's goal is to prove a generalized Brascamp-Lieb type inequality, one possible route 
is to directly prove the corresponding generalized subadditivity of the entropy inequality. We shall 
demonstrate the utility of this approach by giving a simple proof of the classical Brascamp-Lieb 
inequality on $\mathbb{R}^n$, including a determination of all of the cases of equality, through a direct analysis
of the entropy. We shall use rather elementary properties of the entropy (scaling properties and
conditional entropy) together with geometric properties of the Fisher information. Moreover, the
generalized subadditivity of the entropy inequality that we prove here is new (in its full generality),
and is interesting in and of itself. As we shall see, it turns out to have a rich geometric structure.
From the point of view of information theory, it might also be of interest to use the converse
implication and to reinterpret some Brascamp-Lieb inequalities (such as the sharp Young's convolution
inequality) in terms of subadditivity inequalities for the entropy.

The rest of the paper is organized as follows. In Section 2, we give the proof that (1.3) and (1.4)
are dual to one another, so that once one has one inequality established with the cases of equality 
determined, one has the same for the other. We shall state this duality in a very general setting. 

In Section 3, we prove the sharp version of the general Euclidean subadditivity of the entropy 
inequality. 

In Section 4 we shall deduce some interesting consequences from this, including a generalization 
of Hadamard's inequality for the determinant. 

The final Section 5 gives another duality result showing that the superadditivity inequalities for 
Fisher information are dual to certain convolution type inequalities of ground state eigenvalues of 
Schrödinger operators. These inequalities appear to be new. They may be of some intrinsic interest,
but our interest in them here is that a direct proof of the eigenvalue inequalities would yield a 
direct proof of Fisher information inequalities that would in turn yield entropy and Brascamp-Lieb 
inequalities. 

2 Duality of the Brascamp-Lieb inequality and subadditivity of 
the entropy 

We show that the Brascamp-Lieb inequality is dual to the subadditivity of the entropy, so that 
once one has proved one of these inequalities with sharp constants, one has the other with sharp 
constants too. In fact, we shall see that there is an exact correspondence also for cases of equality, 
but in the next theorem, we focus on the constants. 

We shall state the result in a more general setting than the one described in the introduction. We
consider a reference measure space $(\Omega, \mathcal{S}, \mu)$ and a family of measure spaces $(M_j, \mathcal{M}_j, \nu_j)$ together
with measurable functions $p_j : \Omega \to M_j$, $j \le m$. For a probability density $f$ on $\Omega$ (with respect
to $\mu$), the marginal $f_{(p_j)}$ is thus defined as the probability density on $M_j$ (with respect to $\nu_j$) such
that

$$\int_\Omega \phi(p_j(x)) f(x) \, d\mu(x) = \int_{M_j} \phi(t) f_{(p_j)}(t) \, d\nu_j(t) \qquad (2.1)$$

for all bounded measurable functions $\phi$ on $M_j$; accordingly the entropies are given by

$$S(f) = \int_\Omega f \ln f \, d\mu \quad \text{and} \quad S(f_{(p_j)}) = \int_{M_j} f_{(p_j)} \ln(f_{(p_j)}) \, d\nu_j .$$

As explained in the introduction, we are mainly interested in the case $(M_j, \mathcal{M}_j, \nu_j) = (\mathbb{R}, \mathcal{B}, \nu)$ for
all $j \le m$, where $\nu$ is the Lebesgue measure on $\mathbb{R}$.



2.1 THEOREM. Let $(\Omega, \mathcal{S}, \mu)$ be a measure space, $m \ge 1$ and for $j \le m$, let $(M_j, \mathcal{M}_j, \nu_j)$ be a
measure space together with a measurable function $p_j$ from $\Omega$ to $M_j$. For any probability density $f$
on $\Omega$, let $f_{(p_j)}$, the probability density on $M_j$, be defined as in (2.1). Finally, let $\{c_1, \dots, c_m\}$ be any
set of $m$ nonnegative numbers.

Then for any $D \in \mathbb{R}$, the following two assertions are equivalent:

1. For any $m$ nonnegative functions $f_j : M_j \to \mathbb{R}_+$, $j \le m$, we have

$$\int_\Omega \prod_{j=1}^m f_j(p_j(x)) \, d\mu(x) \le e^D \prod_{j=1}^m \left( \int_{M_j} f_j^{1/c_j}(t) \, d\nu_j(t) \right)^{c_j} . \qquad (2.2)$$

2. For every probability density $f$ on $(\Omega, \mathcal{S}, \mu)$ with finite entropy, we have

$$\sum_{j=1}^m c_j S(f_{(p_j)}) \le S(f) + D . \qquad (2.3)$$

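The proof depends on a well known expression for the entropy as a Legendre transform: For
any probability density $f$ on $\Omega$, and any function $\phi$ such that $e^\phi$ is integrable,

$$\int_\Omega f \ln\left( \frac{e^\phi}{f} \right) d\mu = \int_\Omega f \phi \, d\mu - \int_\Omega f \ln f \, d\mu .$$

On the other hand, by Jensen's inequality,

$$\int_\Omega f \ln\left( \frac{e^\phi}{f} \right) d\mu \le \ln\left( \int_\Omega e^\phi \, d\mu \right) .$$

Therefore,

$$\int_\Omega f \ln f \, d\mu + \ln\left( \int_\Omega e^\phi \, d\mu \right) \ge \int_\Omega f \phi \, d\mu , \qquad (2.4)$$

and there is equality if and only if $e^\phi$ is a constant multiple of $f$ on the support of $f$. We shall use
that this Legendre duality nicely combines with the operation of taking marginals.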
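
On a finite measure space the Legendre inequality (2.4) and its equality case can be checked directly. The following minimal Python sketch does this; the function names and the representation of $\mu$ as a list of point masses are assumptions made for the illustration.

```python
import math

def entropy(f, mu):
    """S(f) = sum_x f(x) ln f(x) mu(x), with the convention 0 ln 0 = 0."""
    return sum(fx * math.log(fx) * w for fx, w in zip(f, mu) if fx > 0)

def legendre_gap(f, phi, mu):
    """Left side minus right side of (2.4); nonnegative for every probability density f."""
    log_partition = math.log(sum(math.exp(p) * w for p, w in zip(phi, mu)))
    pairing = sum(fx * p * w for fx, p, w in zip(f, phi, mu))
    return entropy(f, mu) + log_partition - pairing

def gibbs_density(phi, mu):
    """The density e^phi / (integral of e^phi dmu), which gives equality in (2.4)."""
    z = sum(math.exp(p) * w for p, w in zip(phi, mu))
    return [math.exp(p) / z for p in phi]
```

The gap computed by `legendre_gap` is the relative entropy of $f \, d\mu$ with respect to the normalized measure $e^\phi \, d\mu$, which is why it vanishes exactly for the density returned by `gibbs_density`.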

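Proof of Theorem 2.1: First, assume (2.2). Consider any probability density $f$ on $\Omega$, and any
$m$ functions $\phi_j$ on $M_j$, $j \le m$. Using (2.4) with $\phi$ defined on $\Omega$ by

$$\phi(x) := \sum_{j=1}^m c_j \phi_j(p_j(x)) \qquad (2.5)$$

and (2.1) we get

$$\int_\Omega f(x) \ln f(x) \, d\mu \ge \int_\Omega f(x) \left[ \sum_{j=1}^m c_j \phi_j(p_j(x)) \right] d\mu - \ln\left( \int_\Omega e^{\phi(x)} \, d\mu(x) \right) = \sum_{j=1}^m c_j \int_{M_j} f_{(p_j)}(t) \phi_j(t) \, d\nu_j(t) - \ln\left( \int_\Omega \prod_{j=1}^m e^{c_j \phi_j(p_j(x))} \, d\mu(x) \right) . \qquad (2.6)$$

Then from the assumption (2.2) applied with $f_j = e^{c_j \phi_j}$,

$$\int_\Omega \prod_{j=1}^m e^{c_j \phi_j(p_j(x))} \, d\mu(x) \le e^D \prod_{j=1}^m \left( \int_{M_j} e^{\phi_j(t)} \, d\nu_j(t) \right)^{c_j} .$$

Therefore, (2.6) becomes

$$\int_\Omega f(x) \ln f(x) \, d\mu(x) \ge \sum_{j=1}^m c_j \left[ \int_{M_j} f_{(p_j)}(t) \phi_j(t) \, d\nu_j(t) - \ln\left( \int_{M_j} e^{\phi_j(t)} \, d\nu_j(t) \right) \right] - D . \qquad (2.7)$$

Now the optimal choice $\phi_j = \ln f_{(p_j)}$ leads to (2.3).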

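Conversely, suppose that (2.3) is true. Consider $m$ functions $\phi_j$ on $M_j$, $j \le m$, and define $\phi$ on
$\Omega$ as in (2.5). Suppose that $e^\phi$ is integrable, and choose $f$ to be the probability density

$$f(x) = \left( \int_\Omega e^{\phi(y)} \, d\mu(y) \right)^{-1} e^{\phi(x)} , \qquad (2.8)$$

so that there is equality in (2.4). Then we have from (2.4) that

$$\ln\left( \int_\Omega \prod_{j=1}^m e^{c_j \phi_j(p_j(x))} \, d\mu(x) \right) = \sum_{j=1}^m c_j \int_{M_j} f_{(p_j)}(t) \phi_j(t) \, d\nu_j(t) - \int_\Omega f(x) \ln f(x) \, d\mu(x) . \qquad (2.9)$$

On the other hand, (2.3) reads as

$$\int_\Omega f(x) \ln f(x) \, d\mu(x) \ge \sum_{j=1}^m c_j \int_{M_j} f_{(p_j)}(t) \ln f_{(p_j)}(t) \, d\nu_j(t) - D , \qquad (2.10)$$

and so (2.9), and then (2.4) applied on $(M_j, \nu_j)$ with the probability density $f_{(p_j)}$ and the function
$\phi_j$ for each $j \le m$, imply

$$\ln\left( \int_\Omega \prod_{j=1}^m e^{c_j \phi_j(p_j(x))} \, d\mu(x) \right) \le \sum_{j=1}^m c_j \left[ \int_{M_j} f_{(p_j)}(t) \phi_j(t) \, d\nu_j(t) - \int_{M_j} f_{(p_j)}(t) \ln f_{(p_j)}(t) \, d\nu_j(t) \right] + D \le \sum_{j=1}^m c_j \ln\left( \int_{M_j} e^{\phi_j(t)} \, d\nu_j(t) \right) + D . \qquad (2.11)$$

Exponentiating both sides, we obtain (2.2). □

We next examine the relation between cases of equality in the two inequalities.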



2.2 THEOREM. Using the notation of the previous theorem, suppose that $f$ is a probability
density on $\Omega$ for which equality holds in the subadditivity inequality (2.3). Then the marginals
$f_{(p_1)}, f_{(p_2)}, \dots, f_{(p_m)}$ of $f$ yield equality in the Brascamp-Lieb inequality (2.2), and moreover, $f$ and
its marginals satisfy

$$f(x) = e^{-D} \prod_{j=1}^m \left( f_{(p_j)}(p_j(x)) \right)^{c_j} . \qquad (2.12)$$

Conversely, suppose that $f_1, \dots, f_m$ are $m$ probability densities (on $M_j$ with respect to $\nu_j$ for
$j = 1, \dots, m$, respectively) for which equality holds in the Brascamp-Lieb inequality (2.2). Then
the probability density $f$ defined on $\Omega$ by

$$f(x) := e^{-D} \prod_{j=1}^m \left( f_j(p_j(x)) \right)^{c_j}$$

yields equality in the subadditivity inequality (2.3), and moreover $f_j$ is the $j$th marginal of $f$; i.e.
$f_j = f_{(p_j)}$ for $j \le m$.

Proof: Suppose that for some probability density $f$, $\sum_{j=1}^m c_j S(f_{(p_j)}) - S(f) = D$. Then with this
$f$, we must have equality in the first inequality in (2.6), which comes from (2.4). By what we have
said about the cases of equality in (2.4), this means that $\phi$, defined in (2.5), differs from $\ln f$
by an additive constant. Moreover, to get equality in (2.7), we were forced to choose $\phi_j = \ln(f_{(p_j)})$. This ensures
that (2.12) is true.

Furthermore, to get equality in our intermediate application of the Brascamp-Lieb inequality,
we must have that $\{f_{(p_1)}, \dots, f_{(p_m)}\}$ is a set of extremals for the Brascamp-Lieb inequality.

The other assertion follows in the same way. □

By what we have just established, one could try to prove the classical Brascamp-Lieb inequality 
by first proving a general subadditivity of the entropy inequality for random variables in $\mathbb{R}^n$. We
do this in the next section, and shall see that the determination of all of the cases of equality 
is particularly transparent via this route. While the Brascamp-Lieb inequality and subadditivity 
inequality are equivalent, there is an extra richness to the investigation of the cases of equality in 
the subadditivity inequality, as this involves statistical independence in a crucial way. Some hint 
of this can be seen in the following simple example, which sets the stage for the next section: 

Let $m = n$, $c_j = 1$ for all $j$, and $\{a_1, \dots, a_n\}$ be an orthonormal basis of $\mathbb{R}^n$. Take all reference
measures to be Lebesgue measure. Then the Brascamp-Lieb inequality reduces to an equality,
by Fubini's theorem, with $D = 0$, and any set of nonnegative integrable functions $\{f_1, \dots, f_n\}$
provides a case of equality.

On the other hand, the dual inequality is the classical subadditivity of the entropy inequality

$$\sum_{j=1}^n S(X \cdot a_j) \le S(X) ,$$

and equality occurs exactly when the coordinates $\{X \cdot a_1, \dots, X \cdot a_n\}$ form a set of independent
random variables.
In this example, it may appear that the entropy inequality is the more complicated of the two
inequalities. However, the fact that statistical independence enters the picture on the entropy side
is quite helpful: We will make much use of simple entropy inequalities that are saturated only for
independent random variables in our investigation of the cases of equality in the next section.

3 The general subadditivity of the entropy inequality in $\mathbb{R}^n$

Let $\mathbb{R}^n$ be equipped with its standard Euclidean structure. Let $X$ denote a random vector (or
variable if $n = 1$) with values in $\mathbb{R}^n$, and suppose that $X$ has a density $f$. We denote this
correspondence between the random variable $X$ and its density $f$ by writing $X \sim f$ and set

$$S(X) = S(f) = \int_{\mathbb{R}^n} f(x) \ln f(x) \, d^n x .$$

Thus, in this section, we are specializing the general context of the introduction to the case in
which $\Omega$ is $\mathbb{R}^n$, and $\mu$ is Lebesgue measure. We shall also take $\nu$ to be Lebesgue measure on $\mathbb{R}$.

Given a nonzero vector $a$ in $\mathbb{R}^n$, identify $a$ with the linear functional $a(x) = a \cdot x$. Then, if
$f \sim X$ is a probability density on $\mathbb{R}^n$, $f_{(a)}$, as defined by (1.2), is the density of $a \cdot X$, that is
$f_{(a)} \sim a \cdot X$, and

$$S(X \cdot a) = S(f_{(a)}) = \int_{\mathbb{R}} f_{(a)}(t) \ln f_{(a)}(t) \, dt .$$

Note that (1.2) specializes to the requirement that for every bounded and continuous $\phi : \mathbb{R} \to \mathbb{R}$,

$$\int_{\mathbb{R}^n} \phi(x \cdot a) f(x) \, d^n x = \int_{\mathbb{R}} \phi(t) f_{(a)}(t) \, dt . \qquad (3.1)$$

It follows that for all $t \in \mathbb{R}$, $f_{(a)}(t) = \frac{1}{|a|} \int_{\{x \cdot a = t\}} f(x) \, d^{n-1} x$. It is a direct consequence of (3.1) that
for all $\lambda > 0$,

$$f_{(\lambda a)}(t) = \lambda^{-1} f_{(a)}(\lambda^{-1} t) . \qquad (3.2)$$

With these preliminaries out of the way, we turn to the main question to be addressed in this
section: Consider $m$ nonzero vectors $a_1, \dots, a_m$ in $\mathbb{R}^n$, and $m$ numbers $c_1, \dots, c_m$ with $c_j \ge 0$ for
all $j$. Then, we ask:

Is there a finite constant $D$ so that

$$\sum_{j=1}^m c_j S(a_j \cdot X) \le S(X) + D \qquad (3.3)$$

for all random vectors $X$ in $\mathbb{R}^n$, and if so, what is the least such value of $D$, and what are the cases
of equality?

In general there is no finite constant $D$ for which (3.3) is true for all $X$. There are some simple
requirements on $\{a_1, \dots, a_m\}$ and $\{c_1, \dots, c_m\}$ for this to be the case.

First of all, for (3.3) to hold for any finite constant $D$, the set of vectors $\{a_1, \dots, a_m\}$ must span
$\mathbb{R}^n$. The following construction is useful for this and other purposes: Let $V$ be any proper subspace
of $\mathbb{R}^n$, and let $V^\perp$ be its orthogonal complement. Then for any number $\lambda > 0$, let $X_{V,\lambda}$ denote the
centered Gaussian random vector (see below for definition) such that, for unit vectors $u$,

$$\forall u \in V, \ E\big( (u \cdot X_{V,\lambda})^2 \big) = \lambda \quad \text{and} \quad \forall u \in V^\perp, \ E\big( (u \cdot X_{V,\lambda})^2 \big) = 1 . \qquad (3.4)$$

Then

$$S(X_{V,\lambda}) = -\frac{n}{2} \ln(2\pi e) - \frac{\dim(V)}{2} \ln(\lambda) \qquad (3.5)$$

while for any $a$ in $\mathbb{R}^n$,

$$S(a \cdot X_{V,\lambda}) = -\frac{1}{2} \ln(2\pi e) - \frac{1}{2} \ln\big( \lambda |Pa|^2 + |P^\perp a|^2 \big) , \qquad (3.6)$$

where $P$ is the orthogonal projection onto $V$, and $P^\perp = I - P$.

Now take $V$ to be the orthogonal complement of the span of $\{a_1, \dots, a_m\}$. If the latter is a
proper subspace of $\mathbb{R}^n$, then $\dim(V) \ge 1$, and we see that for any finite $D$, (3.3) would be violated
for sufficiently large $\lambda$, since then $|Pa_j|^2 = 0$ for each $j$.

Beyond this spanning condition, there are some simple compatibility conditions that must be
satisfied by the vectors $a_j$ and the numbers $c_j$. First of all, it follows from (3.2) that for all $\lambda > 0$,

$$S(\lambda X) = S(X) - n \ln(\lambda) \quad \text{and} \quad S(a \cdot \lambda X) = S(a \cdot X) - \ln(\lambda) .$$

Therefore, (3.3) can only hold when

$$\sum_{j=1}^m c_j = n . \qquad (3.7)$$

There is a further necessary condition that is somewhat less obvious. The key observation to
make is that the right hand side of (3.6) tends to infinity as $\lambda$ tends to zero if and only if $|P^\perp a|^2 = 0$.
Consider any subset $J$ of $\{1, \dots, m\}$, and let

$$V_J := \mathrm{span}\{a_j \;;\; j \in J\} .$$

Let $G_J$ denote the Gaussian random variable $X_{V_J,\lambda}$ defined by (3.4) when $V = V_J$. Note that for
each $j \in J$, $|P^\perp a_j|^2 = 0$, so that for such $j$,

$$S(a_j \cdot G_J) = -\frac{1}{2} \ln(2\pi e) - \frac{1}{2} \ln(|a_j|^2) - \frac{1}{2} \ln(\lambda) ,$$

which tends to infinity as $\lambda$ tends to zero. Therefore, letting $\lambda$ approach zero, we see that the
leading term in $\sum_{j=1}^m c_j S(a_j \cdot G_J) - S(G_J)$ is at least

$$\frac{1}{2} \left[ \dim(V_J) - \sum_{j \in J} c_j \right] \ln(\lambda) .$$

(It is exactly this unless for some $i \notin J$, $a_i \in V_J$, in which case we could have taken an even "worse"
set $J$.) Hence, if $\dim(V_J) - \sum_{j \in J} c_j < 0$, there can be no upper bound on $\sum_{j=1}^m c_j S(a_j \cdot G) - S(G)$.
Therefore, (3.3) can only hold when it is the case that for all $J$,

$$\sum_{j \in J} c_j \le \dim(V_J) . \qquad (3.8)$$

In particular, we must have $c_j \le 1$ for all $j$.

We shall give a simple proof that these necessary conditions are sufficient. The following notation
shall be used throughout the proof: Given any family $\{a_1, \dots, a_m\}$ of vectors spanning $\mathbb{R}^n$, let

$$A = [a_1, \dots, a_m]$$

denote the $n \times m$ matrix whose $j$th column is $a_j$. We shall also use $A$ to denote the family $\{a_1, \dots, a_m\}$
of spanning vectors. Thinking of $A$ as the matrix of a linear transformation, computed in the
canonical bases of $\mathbb{R}^n$ and $\mathbb{R}^m$, will be useful in the proofs of several lemmas below. Note that $A$
has full (row) rank. Next, let $c$ denote the vector in $\mathbb{R}^m$ whose $j$th entry is $c_j$. Finally, define the
quantity $D(A, c)$ by

$$D(A, c) := \sup_X \left\{ \sum_{j=1}^m c_j S(a_j \cdot X) - S(X) \right\} , \qquad (3.9)$$

where the supremum is taken over all random vectors $X$ with values in $\mathbb{R}^n$ and with finite entropy.
A random vector $X$ for which this supremum is attained will be said to be extremal and will be
called an extremizer.

Notice that with $A$ fixed, $D(A, \cdot)$ is the pointwise supremum of a set of affine functions, and as
such, it is convex. We introduce

$$K_A := \left\{ c \in [0, 1]^m \;;\; c \text{ verifies (3.7) and (3.8) } \forall J \subset \{1, \dots, m\} \right\} , \qquad (3.10)$$

which is clearly a convex subset of the hyperplane of $\mathbb{R}^m$ defined by (3.7). As we have seen, $D(A, c)$
is infinite outside $K_A$. We shall also need later to distinguish the interior of $K_A$ relative to the
intersection of $[0, 1]^m$ and the hyperplane specified by (3.7):



$$K_A^\circ := \left\{ c \in K_A \;;\; \sum_{j \in J} c_j < \dim(V_J), \ \forall J \subset \{1, \dots, m\}, \ J \ne \dots \right\} . \qquad (3.11)$$



We shall make extensive use of the fact that $K_A$ and $K_A^\circ$ are invariant under linear transformation,
in the sense that for any invertible linear operator $T$ on $\mathbb{R}^n$, we obviously have $K_{TA} = K_A$
and $K_{TA}^\circ = K_A^\circ$ with the notation $TA = [Ta_1, \dots, Ta_m]$ when $A = [a_1, \dots, a_m]$.
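
The conditions (3.7) and (3.8) that define $K_A$ involve only the dimensions of the spans $V_J$, so for small $m$ they can be tested by brute force over all subsets $J$. The following minimal Python sketch does this with exact rational arithmetic; the function names `rank` and `in_K_A` are hypothetical, and the inputs are assumed to be integers or `Fraction` values so that no rounding occurs.

```python
from fractions import Fraction
from itertools import combinations

def rank(vectors):
    """Rank of a list of vectors, by exact Gauss-Jordan elimination over the rationals."""
    rows = [[Fraction(x) for x in v] for v in vectors]
    r = 0
    ncols = len(rows[0]) if rows else 0
    for col in range(ncols):
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(len(rows)):
            if i != r and rows[i][col] != 0:
                factor = rows[i][col] / rows[r][col]
                rows[i] = [x - factor * y for x, y in zip(rows[i], rows[r])]
        r += 1
    return r

def in_K_A(vectors, c):
    """Check that the a_j span R^n, that c_j >= 0, and that (3.7) and (3.8) hold."""
    n, m = len(vectors[0]), len(vectors)
    c = [Fraction(x) for x in c]
    if rank(vectors) != n or any(x < 0 for x in c) or sum(c) != n:
        return False
    for size in range(1, m + 1):
        for J in combinations(range(m), size):
            # condition (3.8): sum of c_j over J is at most dim(V_J)
            if sum(c[j] for j in J) > rank([vectors[j] for j in J]):
                return False
    return True
```

Since an invertible map $T$ preserves the rank of every subfamily, replacing each $a_j$ by $Ta_j$ leaves the answer unchanged, which is the invariance just described. The enumeration visits $2^m - 1$ subsets, so it is only meant for small examples.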
Also define $D_g(A, c)$, the Gaussian analog of (3.9), by

$$D_g(A, c) := \sup_G \left\{ \sum_{j=1}^m c_j S(a_j \cdot G) - S(G) \right\} , \qquad (3.12)$$

in which the supremum is taken over all centered Gaussian random vectors $G$ with values in $\mathbb{R}^n$. By
a centered Gaussian random vector, we mean one that has a density of the form

$$\frac{1}{|\det(C)|} \left( \frac{1}{2\pi} \right)^{n/2} e^{-|C^{-1} x|^2 / 2}$$

for some symmetric invertible matrix $C$ on $\mathbb{R}^n$. More generally, a Gaussian random vector is a
random vector of the form $x_0 + G$ with $x_0 \in \mathbb{R}^n$ and $G$ a centered Gaussian random vector. We can
restrict ourselves to centered random vectors because the entropy is invariant under translation. A
Gaussian random vector is said to be isotropic if its covariance matrix is a multiple of the identity;
it is said to be standard if it is centered and if its covariance matrix is the identity (i.e. it is a
$\mathcal{N}(0, \mathrm{Id})$ Gaussian vector).

At this point, it is important to note that all the definitions made so far make sense more
generally on a finite dimensional Euclidean space $(E, \cdot)$. We have made the identification $E = \mathbb{R}^n$,
which has the advantage of allowing us to work with matrices. Later, we shall also need to work on
subspaces of $\mathbb{R}^n$, which are then canonically equipped with the Euclidean structure inherited from
$\mathbb{R}^n$; we then need to work with the corresponding Euclidean versions of the notions introduced
above.

It is clear that $D_g(A, c)$ is also a convex function of $c$, and that $D_g(A, c) \le D(A, c)$. Also, since
our proof that $D(A, c) = \infty$ for $c \notin K_A$ used a centered Gaussian random vector, it shows also that
$D_g(A, c) = \infty$ for $c \notin K_A$. In fact, we have the following:

3.1 THEOREM. For every family $A = \{a_1, \dots, a_m\}$ of $m$ vectors spanning $\mathbb{R}^n$ and every vector
$c$ in $\mathbb{R}^m$ with $0 \le c_j \le 1$ for all $j$, we have

$$D(A, c) = D_g(A, c) ,$$

and furthermore $D(A, c)$ is finite if and only if $c \in K_A$.

The proof will be accomplished in three steps: 

Step 1: We shall first consider the case in which the vectors $a_j$ are all unit vectors $u_j$ satisfying
the following special condition, put forward by K. Ball in the setting of Brascamp-Lieb inequalities
(see e.g. [1]):

$$\sum_{j=1}^m c_j \, u_j \otimes u_j = \mathrm{Id}_{\mathbb{R}^n} , \qquad (3.13)$$

with $c_j > 0$. (Note that (3.7) automatically holds, as can be seen by taking the trace, and that
$c_j \le 1$ for all $j \le m$.) Under this condition, we give a simple proof of Theorem 3.1 using an
elementary superadditivity property of the Fisher information and integration along the heat flow.
The proof here draws on ideas from [3].

Step 2: We shall show that for $c \in K_A^\circ$ there is a linear change of variables that reduces this case to
the one considered in the first step. While the lemma that provides the existence of the change of
variables would appear to be a simple statement about linear algebra, the existence of this change
of variables is intimately connected with the existence of Gaussian optimizers for the subadditivity
(and hence the Brascamp-Lieb) inequality.

Step 3: We show that on $K_A \setminus K_A^\circ$, the variational problem in (3.9) may be "split" into two problems
of the same type, but each involving only a subset of the original vectors, and integration over a
proper subspace of $\mathbb{R}^n$. Repeating this splitting operation, one eventually reduces to variational
problems of the type considered in the second step. This step is modeled after a similar splitting
argument developed in [11], but as we shall see, the entropic version has advantages that will help
us determine all of the cases of equality.

Remark: If one is content to prove only that $D(A, c)$ is finite if and only if $c \in K_A$, there is
a very expeditious route: One can easily check the finiteness of $D(A, c)$ at the extreme points
$c$ of $K_A$ (where, as shown by Barthe, each $c_j$ is either 0 or 1). Then the convexity of $D(A, c)$ implies
finiteness on all of $K_A$, and we know it is infinite outside. Proving the equality $D(A, c) = D_g(A, c)$
on all of $K_A$ is more subtle: The values of $D(A, c)$ and $D_g(A, c)$ do jump as one crosses the
boundary of $K_A$, and we see nothing to preclude $D(A, c)$ from jumping up more than $D_g(A, c)$ on
the boundary. Thus, it is not only for the classification of the cases of equality that we argue as we
do in the third step: we do not know of any quick way to "pass to the boundary" of $K_A$ and wrap
up the proof of Theorem 3.1 after the second step without developing the splitting argument.

We now begin with the first step. Here we shall use a simple superadditivity result for the
Fisher information: If $X \sim f$ is a random vector with a differentiable density $f$, define the Fisher
information of $X$ or of $f$ by

$$I(X) = I(f) = \int_{\mathbb{R}^n} \frac{|\nabla f(x)|^2}{f(x)} \, d^n x . \qquad (3.14)$$

This quantity is related to the entropy through the heat flow as follows: Let $\Delta$ denote the
Laplacian on $\mathbb{R}^n$, and let $G$ denote a standard Gaussian random vector on $\mathbb{R}^n$ independent of $X$,
so that if $f \sim X$,

$$e^{t\Delta} f \sim X + \sqrt{2t} \, G .$$

Then we have the identity

$$\frac{d}{dt} S(e^{t\Delta} f) = -I(e^{t\Delta} f) ,$$

and in particular, the right hand side is finite for all $t > 0$.

The basic inequality concerning the Fisher information that will yield our subadditivity result
is the fact that for any unit vector $u$,

$$I(f_{(u)}) = I(u \cdot X) \le \int_{\mathbb{R}^n} \frac{(u \cdot \nabla f(x))^2}{f(x)} \, d^n x \qquad (3.15)$$

with equality if and only if $f$ is the product of $f_{(u)}$ and a probability density $g$ on the orthogonal
complement of $u$. This was proved in [10]: see Theorem 2 there with $p = 2$. Let us include here for
completeness a different proof taken from [5] (where more abstract settings are studied). This proof
requires more regularity than the one in [10], but that is fine for our purpose, as we shall apply the
inequality along the heat flow.
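
For a centered Gaussian density both sides of (3.15) have closed forms, which gives a simple sanity check. If $X$ has covariance matrix $C$, then $I(u \cdot X) = 1/(u \cdot Cu)$, while the right hand side of (3.15) equals $u \cdot C^{-1} u$; these Gaussian formulas are standard facts used here as assumptions, not statements from the paper. The following minimal Python sketch, restricted to $n = 2$ and using hypothetical function names, evaluates both quantities.

```python
def quadratic_form(M, u):
    """u . M u for a 2x2 matrix M and a vector u in R^2."""
    return sum(u[r] * M[r][s] * u[s] for r in range(2) for s in range(2))

def inverse_2x2(C):
    """Inverse of a symmetric positive definite 2x2 matrix."""
    det = C[0][0] * C[1][1] - C[0][1] * C[1][0]
    if C[0][0] <= 0 or det <= 0:
        raise ValueError("covariance must be positive definite")
    return [[C[1][1] / det, -C[0][1] / det], [-C[1][0] / det, C[0][0] / det]]

def marginal_fisher_information(C, u):
    """I(u . X) = 1 / Var(u . X) for X centered Gaussian with covariance C, |u| = 1."""
    return 1.0 / quadratic_form(C, u)

def directional_fisher_information(C, u):
    """Right hand side of (3.15) for the same Gaussian: u . C^{-1} u."""
    return quadratic_form(inverse_2x2(C), u)
```

In this Gaussian case (3.15) reduces to the Cauchy-Schwarz inequality $1 = (u \cdot u)^2 \le (u \cdot Cu)(u \cdot C^{-1} u)$, with equality exactly when $u$ is an eigenvector of $C$, which matches the product structure in the equality condition.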

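Using the definition of the marginal (3.1) twice and Hölder's inequality, we have:

$$I(f_{(u)}) = -\int_{\mathbb{R}} f_{(u)} \, (\ln f_{(u)})'' \, dt = -\int_{\mathbb{R}^n} f(x) \, (\ln f_{(u)})''(x \cdot u) \, d^n x = \int_{\mathbb{R}^n} \frac{(f_{(u)})'(x \cdot u) \, (u \cdot \nabla f(x))}{f_{(u)}(x \cdot u)} \, d^n x$$

$$\le \left( \int_{\mathbb{R}^n} \left( \frac{(f_{(u)})'(x \cdot u)}{f_{(u)}(x \cdot u)} \right)^2 f(x) \, d^n x \right)^{1/2} \left( \int_{\mathbb{R}^n} \frac{(u \cdot \nabla f)^2}{f} \, d^n x \right)^{1/2} = I(f_{(u)})^{1/2} \left( \int_{\mathbb{R}^n} \frac{(u \cdot \nabla f)^2}{f} \, d^n x \right)^{1/2} .$$

This proves (3.15). Equality in (3.15) requires equality in Hölder's inequality and so for some $\lambda \in \mathbb{R}$
we have $(u \cdot \nabla) \log f(x) = \lambda \, (\log f_{(u)})'(x \cdot u)$ for all $x \in \mathbb{R}^n$; this $\lambda$ has to be 1 for equality to hold
in (3.15) and therefore $f(x) = f_{(u)}(x \cdot u) \, h(x - (x \cdot u)u)$ for some probability density $h$ on $u^\perp$.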

From (3.15), we immediately deduce the superadditivity of information. But before stating the
result, let us make a definition needed to discuss the cases of equality. 

3.2 DEFINITION (Reducible spanning set). Let $\{a_1, \dots, a_m\}$ be any set of $m$ vectors spanning
$\mathbb{R}^n$. It is a reducible spanning set in case there are two proper subspaces $V_1$ and $V_2$ of $\mathbb{R}^n$ such
that $\mathbb{R}^n = V_1 \oplus V_2$, and such that each $a_j$ belongs to either $V_1$ or to $V_2$. Otherwise, $\{a_1, \dots, a_m\}$ is
called an irreducible spanning set.

3.3 PROPOSITION. Consider any set $\{u_1, \dots, u_m\}$ of $m$ unit vectors in $\mathbb{R}^n$, such that there
are numbers $\{c_1, \dots, c_m\}$, with $0 \le c_j \le 1$ for each $j \le m$, so that the decomposition of the
identity (3.13) is satisfied. Let $G$ denote a standard Gaussian random vector.

Then for all random vectors $X$ with finite Fisher information,

$$\sum_{j=1}^m c_j I(u_j \cdot X) \le I(X) , \qquad (3.16)$$

with equality if $X = G$, and for all random vectors $X$ with finite entropy

$$\sum_{j=1}^m c_j S(u_j \cdot X) - S(X) \le \sum_{j=1}^m c_j S(u_j \cdot G) - S(G) = 0 . \qquad (3.17)$$

Moreover there is equality in these inequalities if and only if for each $j \le m$, $u_j \cdot X$ and
$X - (u_j \cdot X) u_j$ are independent. Under the condition that $n \ge 2$ and that $\{u_1, \dots, u_m\}$ is an
irreducible spanning set, there is equality in these inequalities if and only if $X$ is an isotropic
Gaussian random vector.

Note that this proposition in particular implies that $D(U, c) = D_g(U, c) = 0$ when $U =
[u_1, \dots, u_m]$ are unit vectors of $\mathbb{R}^n$ and $c = (c_1, \dots, c_m)$ nonnegative real numbers satisfying (3.13).

The proof of (3.16) and (3.17) is elementary and follows [3]. The determination of the cases of
equality requires a bit more work, but it remains quite direct (compared to the analogous result on the
side of the Brascamp-Lieb inequality).

Proof: Inequality (3.16) follows immediately from (3.15) and condition (3.13) rewritten in the
form

$$\forall x \in \mathbb{R}^n , \quad \sum_{j=1}^m c_j (x \cdot u_j)^2 = |x|^2 .$$

Equality for $X = G$ is obvious as $G \cdot u_j$ is a standard Gaussian variable and so the computation boils
down to the equality $\sum_j c_j = n$. (For the same reason the right-hand side of the inequality (3.17)
is zero.)
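
A concrete family satisfying the decomposition of the identity (3.13) is given by three unit vectors in $\mathbb{R}^2$ at mutual angles of $120°$, with all weights equal to $2/3$. This example and the function names are assumptions introduced for illustration; the paper does not single out this family. The following minimal Python sketch builds the operator $\sum_j c_j \, u_j \otimes u_j$ and evaluates the Gaussian quantity on the right-hand side of (3.17).

```python
import math

def mercedes_frame():
    """Three unit vectors in R^2 at angles 0, 120 and 240 degrees, with weights 2/3."""
    units = [(math.cos(2.0 * math.pi * k / 3.0), math.sin(2.0 * math.pi * k / 3.0))
             for k in range(3)]
    return units, [2.0 / 3.0] * 3

def frame_operator(units, c):
    """Matrix of sum_j c_j u_j (x) u_j, the left-hand side of (3.13)."""
    n = len(units[0])
    return [[sum(cj * u[r] * u[s] for cj, u in zip(c, units)) for s in range(n)]
            for r in range(n)]

def gaussian_deficit(units, c):
    """sum_j c_j S(u_j . G) - S(G) for a standard Gaussian G and unit vectors u_j."""
    n = len(units[0])
    s1 = -0.5 * math.log(2.0 * math.pi * math.e)  # entropy of a standard Gaussian on R
    return sum(c) * s1 - n * s1
```

Here `frame_operator` returns the identity matrix up to rounding, and `gaussian_deficit` vanishes because the weights sum to $n = 2$, in agreement with the trace computation mentioned in Step 1.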

As we have noted, the Fisher information of $f$ is related to the entropy of $f$ through
$\frac{d}{dt} S(e^{t\Delta} f) = -I(e^{t\Delta} f)$. It is also easy to see (using that $\Delta$ commutes with translations) that if $u$
is any unit vector, then $f_{(u)}$, the marginal of $f$ along $u$, has the property that $(e^{t\Delta} f)_{(u)} = e^{t\Delta} f_{(u)}$,
where we keep the same notation for the 1-dimensional heat semigroup ($\Delta g = g''$ in dimension 1);
we again have (in dimension 1) that

$$\frac{d}{dt} S\big( (e^{t\Delta} f)_{(u)} \big) = -I\big( (e^{t\Delta} f)_{(u)} \big) .$$

Then since $e^{t\Delta} f \sim X + \sqrt{2t} \, G$ and because $\sum_{j=1}^m c_j S(u_j \cdot X) - S(X)$ is invariant under dilation,
i.e. under the substitution $X \to \lambda X$, we get

$$\left[ \sum_{j=1}^m c_j S(u_j \cdot G) - S(G) \right] - \left[ \sum_{j=1}^m c_j S(u_j \cdot X) - S(X) \right] = \int_0^\infty \left[ I(e^{t\Delta} f) - \sum_{j=1}^m c_j I\big( (e^{t\Delta} f)_{(u_j)} \big) \right] dt .$$

By (3.16) the integrand above is nonnegative for all $t$, and so (3.17) is proved.

The condition for cases of equahty in (j3.15p tell us that there is equality in (|3.16p for a random 
vector X with finite Fisher information if and only if X verifies the following property (V): 

iV) \/i < m, X ■ Ui and X — {X ■ Ui)ui are independent. 

If G is a standard Gaussian random vector independent of X, then X verifies (V) if and only if 
for all t > 0, X + y/iG verifies (V). Thus for a random vector with finite entropy, there is equality 
in ()3.17p if and only if X verifies (V). 

Our goal is now to characterize, when n > 2, the random vectors verifying (V) under the 
assumption that {ui, . . . ,Um} is an irreducible spanning set of unit vectors. First note that if 
we prove that X + ViG is an isotropic Gaussian for all t > 0, then so is X. Therefore, using 
again the stability of the property (V), we need only consider random vectors X with smooth and 
strictly positive density. Secondly, we can assume that no two vectors of the family {'Uj}j<m are 
linearly dependent. Indeed, by keeping only one representative for the subspaces Wuj, we construct 
a subfamily of the vectors {ui}i<m which span R" and which remains irreducible. 

So from now let {ui, . . . , Um} is an irreducible spanning set of unit vectors of M" (n > 2), with 
no two vectors linearly dependent, and X a random vector verifying (V) and with a smooth density 
/ > 0. Thus for every i < m there exists two probability densities gi and hi, on R and uf- ~ R"~^ 
respectively, such that 

fix) = gi{x ■ Ui)hi{x - {x ■ Ui)x) 
Writing F = log f,Gi = log gi and Hi = log hi for each i < m, we have 

F{x) = Gi{x ■ Ui) + Hi{x - (x • Ui)x) , 

so that 

{ui ■ V)F(x) = G'iiu, ■ x) . 

Hence for any j ^ i, 

{uj ■ V){ui ■ V)F(x) = (ui ■ Uj)GUut ■ x) . 
Interchanging the roles of i and j, 

{uj ■ V){ui ■ V)F(x) = {ui ■ Uj)G]{uj ■ x) . 



m 



S{uj ■ X) - S{X) 



lie'^f) 



Hie'^f)iu,)) 



dt. 
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Evidently the left hand side depends on x only thorough Uj • x and only through uj ■ x. But since 
Ui and Uj are linearly independent, this means that the left hand side is constant. Hence, 



for every i / j, {ui ■ V){ui ■ V)F is constant. 

Furthermore, under the condition that $\{u_1, \dots, u_m\}$ is an irreducible spanning set, if any one vector $u_i$ is removed from $\{u_1, \dots, u_m\}$, the remaining vectors still span $\mathbb{R}^n$. For otherwise, since $m \ge n \ge 2$, we could take $V_1$ to be the span of $\{u_i\}$, and take $V_2$ to be the span of $\{u_1, \dots, u_m\} \setminus \{u_i\}$, and we would have $\mathbb{R}^n = V_1 \oplus V_2$. Thus each $u_i$ decomposes in the generating family $\{u_j\}_{j \ne i}$ and therefore,

for every $i, j \le m$, $(u_i \cdot \nabla)(u_j \cdot \nabla)F$ is constant.

But this implies that the Hessian of $F$ is constant. Thus, $X$ is Gaussian. To prove that this Gaussian is isotropic, let $C$ be the covariance matrix of $X$. Then property (V) implies that each $u_i$ is an eigenvector of $C$. Since eigenvectors of symmetric matrices are orthogonal if they have distinct eigenvalues, all of the eigenvalues must be the same unless there is a "splitting" of $\mathbb{R}^n$ into at least two (orthogonal) subspaces that together contain all of the vectors $u_j$. This would contradict the hypothesis that $\{u_1, \dots, u_m\}$ is an irreducible spanning set. □

The following lemma will facilitate the application of the statement concerning the cases of equality in Proposition 3.3.

3.4 LEMMA. Let $A = \{a_1, \dots, a_m\}$ be any family of $m$ vectors spanning $\mathbb{R}^n$. If $\{a_1, \dots, a_m\}$ is a reducible spanning set and $D(A,c)$ is finite, then $c \notin K_A^\circ$.

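Proof: Let $\mathbb{R}^n = V_1 \oplus V_2$ be a decomposition of $\mathbb{R}^n$ into two proper subspaces such that each $a_j$ is contained in one of them or the other. Let $V$ be the orthogonal complement of $V_1$, so that $\mathbb{R}^n = V_1 \oplus^{\perp} V$, and let $X_{V,\lambda}$ be the Gaussian random variable defined as in (3.4). Then by (3.5) and (3.6), with $P$ denoting the orthogonal projection onto $V$,

$$\sum_{j=1}^m c_j S(a_j \cdot X_{V,\lambda}) - S(X_{V,\lambda}) = -\frac{1}{2}\Big( \sum_{a_j \in V_1} c_j \ln(|a_j|^2) + \sum_{a_j \in V_2} c_j \ln\big(\lambda |Pa_j|^2 + |P^{\perp}a_j|^2\big) \Big) + \frac{1}{2}\dim(V)\ln(\lambda) , \tag{3.18}$$

with $Pa_j \ne 0$ for $a_j \in V_2$, since $Px = 0 \iff x \in V_1$. Then, using that $\dim(V) = \dim(V_2)$, this expression (in $\lambda$) has the form

$$\frac{1}{2}\Big(\dim(V_2) - \sum_{a_j \in V_2} c_j\Big)\ln(\lambda) + (\text{terms bounded in } \lambda \ge 1) ,$$

which is unbounded for large $\lambda$ unless

$$\sum_{a_j \in V_2} c_j = \dim(V_2) .$$

This must be the case since by hypothesis $D(A,c) < \infty$. Thus, $c \notin K_A^\circ$. □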
We have now completed the first step. We start the second by showing that the change of variables matrix $R$ does exist for $c \in K_A^\circ$. The existence of such a change of variables can be deduced from results of Bennett-Carbery-Christ-Tao [6]. However, the flow of logic in their deduction (and in [11]) runs counter to ours: They first show that such a change of variables exists whenever there are Gaussian optimizers for the Brascamp-Lieb problem, and then show that Gaussian optimizers exist for $c \in K_A^\circ$. Here, we need the change of variables at the outset of our analysis, and hence need a direct proof of this result. We now provide one, using a geometric result of Barthe.

3.5 LEMMA. Let $A = \{a_1, \dots, a_m\}$ be any family of $m$ vectors that span $\mathbb{R}^n$. Let $\{c_1, \dots, c_m\}$ be any $m$ numbers verifying $0 \le c_j \le 1$ and satisfying (3.8). If $c \in K_A^\circ$, then there exists an invertible symmetric $n \times n$ matrix $R$ so that

$$\sum_{j=1}^m c_j \Big(\frac{Ra_j}{|Ra_j|}\Big) \otimes \Big(\frac{Ra_j}{|Ra_j|}\Big) = \mathrm{Id}_{\mathbb{R}^n} . \tag{3.19}$$

When $n \ge 2$, there is exactly one such matrix $R$ satisfying the further requirements that $R$ be positive definite, and that $\mathrm{trace}(R^2) = n$. On the other hand, for $c \notin K_A$, no such matrix $R$ exists.

Remark: After settling the cases of equality in Theorem 3.1 we shall derive necessary and sufficient conditions for the existence of such a matrix $R$. Though the conditions are simple and explicit, it turns out that the matrix $R$ exists if and only if the supremum in (3.12) is attained at some centered Gaussian $G$, and our proof that the conditions we give are necessary and sufficient depends on this.

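Proof: Take any diagonal $m \times m$ matrix $S$ with positive diagonal entries $s_j$, $j \le m$, and define the $n \times n$ matrix $R_S$ by

$$R_S = \big((AS)(AS)^t\big)^{-1/2} .$$

This makes sense since $(AS)(AS)^t$ is a positive definite $n \times n$ matrix. Notice that $(R_S AS)(R_S AS)^t = \mathrm{Id}_{\mathbb{R}^n}$, or, what is the same,

$$\sum_{j=1}^m s_j^2\, R_S a_j \otimes R_S a_j = \mathrm{Id}_{\mathbb{R}^n} .$$

Therefore,

$$\sum_{j=1}^m c_j \Big(\frac{s_j}{\sqrt{c_j}} R_S a_j\Big) \otimes \Big(\frac{s_j}{\sqrt{c_j}} R_S a_j\Big) = \mathrm{Id}_{\mathbb{R}^n} .$$

We have what we seek if and only if for each $j$, $\frac{s_j}{\sqrt{c_j}} R_S a_j$ is a unit vector, which is the case if and only if for each $j$, $c_j = s_j^2 |R_S a_j|^2$. By the definition of $R_S$, this means

$$c_j = e_j \cdot \big[(AS)^t \big((AS)(AS)^t\big)^{-1} (AS)\big] e_j \tag{3.20}$$

where $\{e_1, \dots, e_m\}$ denotes the standard orthonormal basis in $\mathbb{R}^m$. Note that $e_j \cdot (AS)^t \big((AS)(AS)^t\big)^{-1} (AS) e_j$ is also the $j$th diagonal entry of the orthogonal projection in $\mathbb{R}^m$ onto the image of $(AS)^t$.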
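The construction of $R_S$ can be sketched numerically in dimension 2. The Python fragment below is only an illustration under stated assumptions: the vectors `A`, the scalings `s`, and the helper names are hypothetical, and the $2 \times 2$ inverse square root uses the closed form $\sqrt{M} = (M + \sqrt{\det M}\,\mathrm{Id})/\sqrt{\mathrm{tr} M + 2\sqrt{\det M}}$. It goes in the easy direction only: given $S$, it computes the weights $c_j = s_j^2 |R_S a_j|^2$ for which (3.19) holds. The substantial part of the lemma, finding $S$ for a prescribed $c \in K_A^\circ$, is not attempted.

```python
import math


def inv_sqrt_spd_2x2(M):
    """Inverse square root of a symmetric positive definite 2x2 matrix.

    Uses the closed form sqrt(M) = (M + delta*I) / tau with delta = sqrt(det M)
    and tau = sqrt(trace M + 2*delta), then inverts the result.
    """
    a, b, d = M[0][0], M[0][1], M[1][1]
    delta = math.sqrt(a * d - b * b)
    tau = math.sqrt(a + d + 2.0 * delta)
    p, q, r = (a + delta) / tau, b / tau, (d + delta) / tau
    det = p * r - q * q
    return [[r / det, -q / det], [-q / det, p / det]]


def apply(R, v):
    """Matrix-vector product for a 2x2 matrix R."""
    return (R[0][0] * v[0] + R[0][1] * v[1], R[1][0] * v[0] + R[1][1] * v[1])


# Hypothetical data: the vectors a_j (columns of A) and positive scalings s_j.
A = [(1.0, 0.0), (1.0, 1.0), (0.0, 2.0)]
s = [1.0, 0.5, 2.0]

# (AS)(AS)^t = sum_j s_j^2 a_j (x) a_j, and R_S is its inverse square root.
M = [[sum(sj * sj * a[i] * a[k] for a, sj in zip(A, s)) for k in range(2)]
     for i in range(2)]
R_S = inv_sqrt_spd_2x2(M)

Ra = [apply(R_S, a) for a in A]
norms_sq = [v[0] ** 2 + v[1] ** 2 for v in Ra]
c = [sj * sj * nsq for sj, nsq in zip(s, norms_sq)]  # c_j = s_j^2 |R_S a_j|^2

# Left side of (3.19): sum_j c_j (R a_j/|R a_j|) (x) (R a_j/|R a_j|).
lhs = [[sum(cj * v[i] * v[k] / nsq for cj, v, nsq in zip(c, Ra, norms_sq))
        for k in range(2)] for i in range(2)]
```

With these inputs `lhs` should be the identity up to rounding and `sum(c)` should equal $n = 2$, consistent with $c_j$ being the diagonal entries of a rank-$n$ orthogonal projection.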

It has been shown [2] (see [11] for another proof and a statement in this formulation) that there exist positive numbers $s_1, \dots, s_m$ for which (3.20) is true whenever $c \in K_A^\circ$, and that in this case, when $n \ge 2$, the set of numbers is unique up to a common multiple. Thus, for $c \in K_A^\circ$ such an $R$ exists.

As for the uniqueness, note that given any such matrix $R$, we can change variables, replacing $X \to R^{-1}X$ and $a_j \to u_j := |Ra_j|^{-1} Ra_j$. Then Proposition 3.3 may be applied to deduce that the only extremizers for the new problem are isotropic Gaussians. Undoing the change of variables, we see that the only extremizers of the original problem are Gaussians whose covariance is a multiple of $R^2$. Thus, under the further condition that $R$ be positive definite (instead of simply symmetric), and that the trace of $R^2$ is fixed, $R$ is uniquely determined.

The same change of variables argument (which is exploited systematically in Lemma 3.6 below) shows, through Proposition 3.3, that if such a matrix $R$ exists, then $D(A,c) < \infty$. As we have seen, this is impossible when $c \notin K_A$. □

Remark: The first proof that there exists a solution, essentially unique, to (3.20) whenever $c \in K_A^\circ$ is due to Barthe [2]. However, he used a different characterization of $K_A$, and did not mention the condition (3.8). Another proof of this, based directly on (3.8), was given in [11], together with a proof that the characterization of $K_A$ in Barthe's paper is equivalent to the one based on (3.8).

With the change of variable provided by the previous lemma, we can finish the second step and describe what happens when $c \in K_A^\circ$.

3.6 LEMMA. For any family $A = \{a_1, \dots, a_m\}$ of $m$ vectors spanning $\mathbb{R}^n$, and all vectors $c$ in $K_A^\circ$,

$$D(A,c) = D_G(A,c) ,$$

and there exists a Gaussian optimizer. Moreover, if $n \ge 2$, then $\sum_{j=1}^m c_j S(a_j \cdot X) - S(X) = D(A,c)$ if and only if $X$ is Gaussian and its covariance is a constant multiple of $R^2$, where $R$ is the unique positive definite matrix verifying (3.19) with $\mathrm{Tr}(R^2) = n$.

Remark: The condition "$n \ge 2$", which has already appeared several times, is present because in one dimension, the subadditivity problem is trivial, so that Gaussians play no special role. Indeed, assume we are given $c_1, \dots, c_m \ge 0$ with the condition that $\sum_{j=1}^m c_j = 1$ and $A = \{a_1, \dots, a_m\}$ a family of non-zero real numbers. Then, setting

$$D := -\sum_{j=1}^m c_j \log|a_j| ,$$

we have, for every random variable $X$ on $\mathbb{R}$ with finite entropy,

$$\sum_{j=1}^m c_j S(a_j X) - S(X) = D .$$

Therefore $D(A,c) = D$ and every random variable $X$ is an extremizer.
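The one-dimensional identity rests only on the scaling rule $S(aX) = S(X) - \ln|a|$ and on $\sum_j c_j = 1$. The short Python sketch below checks it by numerical quadrature for a non-Gaussian density; the density $f(x) = 2x$ on $[0,1]$, the numbers `a` and `c`, and the midpoint-rule helper are assumptions chosen for illustration, and the entropies are only approximated.

```python
import math


def entropy(density, lo, hi, steps=2000):
    """Midpoint-rule approximation of S = int f ln f over [lo, hi]."""
    h = (hi - lo) / steps
    total = 0.0
    for k in range(steps):
        v = density(lo + (k + 0.5) * h)
        if v > 0.0:
            total += v * math.log(v) * h
    return total


def f(x):
    """Illustrative non-Gaussian density: f(x) = 2x on [0, 1]."""
    return 2.0 * x if 0.0 <= x <= 1.0 else 0.0


def scaled(a):
    """Density of a*X when X has density f (a may be negative)."""
    return lambda y: f(y / a) / abs(a)


# Hypothetical non-zero reals a_j and weights c_j >= 0 summing to 1.
a = [0.5, -3.0, 2.0]
c = [0.2, 0.3, 0.5]

D = -sum(cj * math.log(abs(aj)) for aj, cj in zip(a, c))
lhs = sum(cj * entropy(scaled(aj), min(0.0, aj), max(0.0, aj))
          for aj, cj in zip(a, c)) - entropy(f, 0.0, 1.0)
```

Here `lhs` should agree with `D` up to rounding, even though $X$ is far from Gaussian, which is the sense in which every random variable is an extremizer when $n = 1$.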

Proof: Let $R$ be an invertible symmetric matrix verifying (3.19), provided by Lemma 3.5. Since for any random vector $X$ with finite entropy, we have

$$S(X \cdot a_j) = S\Big(\frac{Ra_j}{|Ra_j|} \cdot R^{-1}X\Big) - \ln(|Ra_j|) \quad \text{and} \quad S(X) = S(R^{-1}X) - \ln(|\det(R)|) ,$$

we obtain

$$\sum_{j=1}^m c_j S(a_j \cdot X) - S(X) = \sum_{j=1}^m c_j S\Big(\frac{Ra_j}{|Ra_j|} \cdot R^{-1}X\Big) - S(R^{-1}X) - \sum_{j=1}^m c_j \ln(|Ra_j|) + \ln(|\det(R)|) .$$

Introduce the family of vectors $u_j := \frac{Ra_j}{|Ra_j|}$ for $j \le m$, and set $U = [u_1, \dots, u_m]$. The previous equality implies that

$$D(A,c) = D(U,c) - \sum_{j=1}^m c_j \ln(|Ra_j|) + \ln(|\det(R)|) . \tag{3.21}$$

Thus we are reduced to studying the problem of determining $D(U,c)$ and the extremizers there (noting that $X$ is an extremizer for $D(A,c)$ if and only if $R^{-1}X$ is an extremizer for $D(U,c)$). Note also that since the vectors $\{u_1, \dots, u_m\}$ are obtained from the vectors $\{a_1, \dots, a_m\}$ by a non singular linear transformation followed by normalization, they span $\mathbb{R}^n$, and we have $K_A = K_U$.

Since $U = [u_1, \dots, u_m]$ is a family of unit vectors verifying the decomposition of the identity (3.13), we can apply Proposition 3.3 and get that

$$D(U,c) = D_G(U,c) = 0 < \infty \tag{3.22}$$

and every isotropic Gaussian vector is an extremizer. To prove that all optimizers are Gaussian when $n \ge 2$, note first that, by Lemma 3.4, $c \in K_U^\circ$ implies that $\{u_1, \dots, u_m\}$ is an irreducible spanning set. Therefore any optimizer of the variational problem defining $D(U,c)$ is an isotropic Gaussian. (Then every optimizer for $D(A,c)$ is Gaussian, with covariance a multiple of $R^2$.) □

Remark: Note that the proof above gives also the following statement: If there exists an invertible matrix $R$ verifying (3.19) then (with no further assumptions on $c$ and $A$) we have that $D(A,c) < +\infty$ and that $RG$ is an extremizer for every standard Gaussian vector $G$.

We now turn to the third step. When $c \in K_A \setminus K_A^\circ$, we will pick a non-empty proper subset $J$ of $\{1, \dots, m\}$ of least cardinality among subsets for which equality holds in (3.8). We shall now show that the variational problem defining $D(A,c)$ splits into two such problems involving fewer vectors and random variables in a lower dimensional space. Repeated splittings, and what we have already proved, will enable us to settle all questions concerning the variational problem defining $D(A,c)$. The splitting argument presented here is patterned on one developed in [11] for the Brascamp-Lieb inequality. However, as we shall see, in the subadditivity setting, the argument leads to a clear and simple analysis of cases of equality. It relies on properties of the conditional entropy.

As mentioned at the beginning of this section, we shall need to work on subspaces of $\mathbb{R}^n$ and thus make use of the definition made above in the setting of Euclidean spaces. For a given family $A = \{v_1, \dots, v_k\}$ of vectors on $\mathbb{R}^n$, we introduce the Euclidean subspace $E := \mathrm{span}\{v_1, \dots, v_k\}$ equipped with the induced Euclidean structure from $\mathbb{R}^n$ (i.e. the scalar product is the same). For real numbers $c_1, \dots, c_k$ with $0 \le c_j \le 1$, the quantities $D(A,c)$ and $D_G(A,c)$ are then implicitly assumed to be defined on the Euclidean subspace $E$ (the random vectors live on $E$ and the entropies are computed with respect to the Lebesgue measure on $E$, where the laws of the vectors live). Accordingly, the set $K_A$ is to be understood as

$$K_A := \Big\{ c \in [0,1]^k \; ; \; \sum_{j=1}^k c_j = \dim(E) \ \text{and (3.8) holds } \forall J \subset \{1, \dots, k\} \Big\} .$$
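Membership in $K_A$ can be tested by brute force over subsets. The Python sketch below is an illustration under an explicit assumption: condition (3.8), which is stated earlier in the paper and not reproduced in this section, is read here as $\sum_{j \in J} c_j \le \dim(\mathrm{span}\{a_j ; j \in J\})$, as suggested by the way it is used in this section. The function names and the three test families are hypothetical, and the loop is exponential in $m$, so it is meant for small examples only.

```python
from itertools import combinations


def rank(vectors, tol=1e-10):
    """Rank of a list of vectors, by Gaussian elimination with partial pivoting."""
    rows = [list(v) for v in vectors]
    ncols = len(rows[0]) if rows else 0
    r = 0
    for col in range(ncols):
        if r == len(rows):
            break
        pivot = max(range(r, len(rows)), key=lambda i: abs(rows[i][col]))
        if abs(rows[pivot][col]) < tol:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            factor = rows[i][col] / rows[r][col]
            rows[i] = [x - factor * y for x, y in zip(rows[i], rows[r])]
        r += 1
    return r


def classify(A, c, tol=1e-9):
    """Return 'interior', 'boundary' or 'outside' for c relative to K_A.

    Assumed reading of (3.8): sum_{j in J} c_j <= dim span{a_j : j in J}.
    'interior' means strict inequality for every non-empty proper subset J.
    """
    m = len(A)
    if any(cj < -tol or cj > 1.0 + tol for cj in c):
        return "outside"
    if abs(sum(c) - rank(A)) > tol:
        return "outside"
    interior = True
    for size in range(1, m):
        for J in combinations(range(m), size):
            gap = rank([A[j] for j in J]) - sum(c[j] for j in J)
            if gap < -tol:
                return "outside"
            if abs(gap) <= tol:
                interior = False
    return "interior" if interior else "boundary"


frame = [(1.0, 0.0), (-0.5, 3 ** 0.5 / 2), (-0.5, -(3 ** 0.5) / 2)]
case_frame = classify(frame, [2.0 / 3.0] * 3)
case_basis = classify([(1.0, 0.0), (0.0, 1.0)], [1.0, 1.0])
case_bad = classify([(1.0, 0.0), (1.0, 0.0), (0.0, 1.0)], [0.9, 0.9, 0.2])
```

Under the assumed reading of (3.8), the three-vector frame with equal weights lies in $K_A^\circ$, the orthonormal basis with unit weights lies in $K_A \setminus K_A^\circ$ (each singleton gives equality, which is the situation handled by the splitting argument), and the last family violates (3.8) on the pair of parallel vectors.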

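Let us fix the following notation. Let $A = \{a_1, \dots, a_m\}$ be a family of $m \ge 1$ vectors spanning a Euclidean space $E$,

$$E = \mathrm{span}(\{a_j \; ; \; j \le m\})$$

(in a first step we shall have $E = \mathbb{R}^n$). For a family of $m$ real numbers $c \in K_A$ and a non-empty proper subset $J$ of $\{1, \dots, m\}$ for which equality holds in (3.8), denote by $P_J$ the orthogonal projection onto $V_J = \mathrm{span}\{a_j \; ; \; j \in J\}$ and let $P_J^{\perp} = \mathrm{Id}_E - P_J$ be the complementary projection. Define, for $j \in J^c := \{i \in \{1, \dots, m\} \; ; \; i \notin J\}$, the vector $b_j = P_J^{\perp} a_j$ and

$$A_J = [a_j \; ; \; j \in J] \quad \text{and} \quad B_{J^c} = [b_j \; ; \; j \in J^c] ,$$

the ordered (by ordering $J$ and $J^c$ as increasing subsequences of $1, \dots, m$) families of vectors $(a_j)_{j \in J}$ and $(b_j)_{j \in J^c}$. For any subset $K$ of $\{1, \dots, m\}$, and $c \in \mathbb{R}^m$, let $c_K$ denote the vector of $\mathbb{R}^{|K|}$ whose coordinates are the $(c_j)_{j \in K}$ ($K$ being written as an increasing subsequence of $1, \dots, m$). Since there is equality in (3.8) for $J$, we have

$$c_J \in K_{A_J} .$$

Note that $V_J + V_{J^c} = E$ (a priori this sum is not direct) and so $V_J^{\perp} = P_J^{\perp} V_{J^c}$. Thus we have $V_J^{\perp} = \mathrm{span}(\{b_j : j \in J^c\})$, i.e.:

$$E = \mathrm{span}(\{a_j \; ; \; j \in J\}) \oplus^{\perp} \mathrm{span}(\{b_j \; ; \; j \in J^c\}) . \tag{3.23}$$

And we also have

$$c_{J^c} \in K_{B_{J^c}} .$$

Indeed, using (3.23) and equality in (3.8) for $J$, we have $\sum_{j \in J^c} c_j = \dim(\mathrm{span}\{b_j \; ; \; j \in J^c\})$, and also for $\tilde{J} \subset J^c$, since $P_J^{\perp} a_j = 0$ for $j \in J$,

$$\begin{aligned}
\sum_{j \in \tilde{J}} c_j + \dim(V_J) &= \sum_{j \in \tilde{J} \cup J} c_j \\
&\le \dim(\mathrm{span}\{a_j \; ; \; j \in \tilde{J} \cup J\}) \\
&= \dim(\mathrm{span}\{P_J a_j + P_J^{\perp} a_j \; ; \; j \in \tilde{J} \cup J\}) \\
&\le \dim(\mathrm{span}\{P_J a_j \; ; \; j \in \tilde{J} \cup J\}) + \dim(\mathrm{span}\{P_J^{\perp} a_j \; ; \; j \in \tilde{J}\}) \\
&= \dim(V_J) + \dim(\mathrm{span}\{b_j \; ; \; j \in \tilde{J}\}) .
\end{aligned}$$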

For an invertible operator $T$ on $\mathbb{R}^n$ we shall use the standard notation

$$T^{-*} := (T^*)^{-1} = (T^{-1})^* ,$$

where $T^*x \cdot y = x \cdot Ty$ for all $x, y \in \mathbb{R}^n$. With these definitions, we now state the splitting lemma. Only the first part of the statement is needed to complete the proof of Theorem 3.1; the rest will be used for the characterization of extremizers.
3.7 LEMMA. Given any family $A = \{a_1, \dots, a_m\}$ of $m$ vectors spanning $\mathbb{R}^n$ and $c \in K_A \setminus K_A^\circ$ with $c_j > 0$ for all $j \le m$, let $J$ be a non-empty proper subset of $\{1, \dots, m\}$ for which equality holds in (3.8), and suppose that $J$ has the least cardinality among all such subsets. Then with $A_J$, $c_J$, $B_{J^c}$ and $c_{J^c}$ defined as above, we have

$$D(A,c) = D(A_J, c_J) + D(B_{J^c}, c_{J^c}) , \tag{3.24}$$

and if $D_G(B_{J^c}, c_{J^c}) = D(B_{J^c}, c_{J^c})$, then $D_G(A,c) = D(A,c)$.

Suppose next that there exists an extremizing random vector $X$; i.e., a random vector $X$ such that

$$\sum_{j=1}^m c_j S(a_j \cdot X) - S(X) = D(A,c) . \tag{3.25}$$

Then

$$\mathbb{R}^n = V_J \oplus V_{J^c} , \tag{3.26}$$

and this direct sum is an orthogonal decomposition in the inner product given by the covariance matrix of $X$; i.e., $\langle x, y \rangle = \mathbb{E}[(x \cdot (X - \mathbb{E}X))(y \cdot (X - \mathbb{E}X))]$.

Moreover, if $T$ is an (invertible) operator on $\mathbb{R}^n$ such that one has the orthogonal decomposition

$$\mathbb{R}^n = TV_J \oplus^{\perp} TV_{J^c}$$

(for instance $T = H_X^{1/2}$ where $H_X$ is the covariance matrix of an extremizer $X$, so that $\langle x, y \rangle = x \cdot H_X y$), then $X$ is an extremizer (3.25) if and only if $T^{-*}X$ decomposes as $T^{-*}X = Y + Z$ where $Y$ and $Z$ are independent random vectors with values in $TV_J$ and $TV_{J^c}$, and which are extremizers for $([Ta_j \; ; \; j \in J], c_J)$ and $([Ta_j \; ; \; j \in J^c], c_{J^c})$, respectively.

The proof of this lemma relies on some well known identities and inequalities concerning conditional entropy that we now recall.

Let $E$ and $F$ be two Euclidean spaces (equipped with the Lebesgue measure). If $W$ and $Y$ are two random vectors with values in $E$ and $F$ respectively, with a joint density $\rho(w,y)$ on $E \times F$, let $\rho_Y(y) = \int_E \rho(w,y)\, dw$ and $\rho_W(w) = \int_F \rho(w,y)\, dy$ be the two marginal densities on $F$ and $E$, which are of course the densities of $Y$ and $W$ respectively.

Then the conditional density of $W$ given $Y$ is $\rho(w|y) = \rho(w,y)/\rho_Y(y)$. The conditional entropy of $W$ given $Y = y$ is then defined to be

$$S(W|Y = y) = \int_E \rho(w|y) \ln \rho(w|y)\, dw .$$

Since the entropy of $(W,Y)$, $S(W,Y)$, is given by

$$S(W,Y) = \int_{E \times F} \rho(w,y) \ln \rho(w,y)\, dw\, dy ,$$

the identity

$$S(W,Y) = \int_F S(W|Y = y)\, \rho_Y(y)\, dy + S(Y) \tag{3.27}$$

follows directly from the definitions. Furthermore, by Jensen's inequality

$$S(W) \le \int_F S(W|Y = y)\, \rho_Y(y)\, dy \tag{3.28}$$

and there is equality if and only if $W$ and $Y$ are independent.
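For jointly Gaussian $(W,Y)$ both (3.27) and (3.28) can be checked in closed form, which also makes the sign convention $S(f) = \int f \ln f$ concrete. The following Python sketch is an illustration only; the unit variances, the correlation values, and the function names are assumptions.

```python
import math

LOG_2PIE = math.log(2.0 * math.pi * math.e)


def S_gauss_1d(var):
    """S = int f ln f for a one-dimensional Gaussian with variance var."""
    return -0.5 * (LOG_2PIE + math.log(var))


def S_gauss_2d(C):
    """S = int f ln f for a centered Gaussian on R^2 with covariance C."""
    det = C[0][0] * C[1][1] - C[0][1] * C[1][0]
    return -0.5 * (2.0 * LOG_2PIE + math.log(det))


def check(rho):
    """Return (S(W,Y), averaged S(W|Y=y), S(W), S(Y)) for correlation rho."""
    joint = S_gauss_2d([[1.0, rho], [rho, 1.0]])
    # Given Y = y, W is Gaussian with variance 1 - rho^2 whatever y is,
    # so the average over y of S(W|Y=y) equals this single value.
    cond = S_gauss_1d(1.0 - rho * rho)
    return joint, cond, S_gauss_1d(1.0), S_gauss_1d(1.0)


joint, cond, s_w, s_y = check(0.6)
joint0, cond0, s_w0, s_y0 = check(0.0)
```

For correlation $0.6$ one should find `joint == cond + s_y` up to rounding, as in (3.27), and `s_w < cond`, as in (3.28); for correlation $0$ the inequality becomes an equality, matching the independence criterion.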

Proof of Lemma 3.7: Fix any random vector $X$ with values in $\mathbb{R}^n$ and suppose that $S(X)$ is finite. We shall use the definitions and notation given before the Lemma. Let $P_J$ denote the orthogonal projection onto $V_J$, and recall that we have the decomposition (3.23), so that $P_J^{\perp} = \mathrm{Id}_{\mathbb{R}^n} - P_J$ is also the orthogonal projection onto $\mathrm{span}(\{b_j : j \in J^c\})$ where $b_j = P_J^{\perp}(a_j)$ for all $j \in J^c$. Let us introduce

$$Y = P_J X \quad \text{and} \quad Z = P_J^{\perp} X ,$$

so that $X = Y + Z$. Then $S(X) = S(Y,Z)$ and so from (3.27),

$$S(X) = \int_{V_J} S(Z|Y = y)\, \rho_Y(y)\, dy + S(Y) . \tag{3.29}$$

For each $j \in J$, we have $a_j \cdot X = a_j \cdot Y$, so that

$$S(a_j \cdot X) = S(a_j \cdot Y) \quad \text{for } j \in J . \tag{3.30}$$

Note that for $j \in J^c$, $b_j \ne 0$, or else $a_j \in V_J$; but this is impossible since $c_j > 0$, and we already have $\sum_{j \in J} c_j = \dim(V_J)$. We have, using the invariance of the entropy under translation,

$$S(a_j \cdot X | Y = y) = S(a_j \cdot Z + a_j \cdot y | Y = y) = S(b_j \cdot Z | Y = y) \quad \text{for } j \in J^c . \tag{3.31}$$

Therefore, by applying (3.28) to $(X \cdot a_j, Y)$ on $\mathbb{R} \times V_J$, we get

$$S(a_j \cdot X) \le \int_{V_J} S(b_j \cdot Z | Y = y)\, \rho_Y(y)\, dy \quad \text{for } j \in J^c . \tag{3.32}$$

Now combining (3.29), (3.30) and (3.32), we have that

$$\sum_{j=1}^m c_j S(a_j \cdot X) - S(X) \le \int_{V_J} \Big[ \sum_{j \in J^c} c_j S(b_j \cdot Z | Y = y) - S(Z | Y = y) \Big] \rho_Y(y)\, dy + \sum_{j \in J} c_j S(a_j \cdot Y) - S(Y) . \tag{3.33}$$

It is clear from (3.33) and the definition of $D(B_{J^c}, c_{J^c})$ that

$$D(A,c) \le D(A_J, c_J) + D(B_{J^c}, c_{J^c}) .$$

To see that there is actually equality here, we use the fact that $J$ is a critical set of minimal cardinality. This implies that $c_J \in K_{A_J}^\circ$, and by Lemma 3.6 there is a centered Gaussian random vector $Y$ for which

$$\sum_{j \in J} c_j S(a_j \cdot Y) - S(Y) = D(A_J, c_J) . \tag{3.34}$$
Pick $\epsilon > 0$ and let $Z$ be any random variable with values in $V_J^{\perp}$ that is independent of $Y$ and such that

$$\sum_{j \in J^c} c_j S(b_j \cdot Z) - S(Z) \ge D(B_{J^c}, c_{J^c}) - \epsilon . \tag{3.35}$$

For $\delta > 0$, form the $\mathbb{R}^n$ valued random vector $X = \delta Y + Z$. Since $Y$ and $Z$ are orthogonal and independent, $S(X) = S(\delta Y, Z) = S(\delta Y) + S(Z)$. The scaling invariance implies that (3.34) holds when $Y$ is replaced by $\delta Y$. Also, for $j \in J^c$, as $\delta$ approaches zero, $S(a_j \cdot X) = S(b_j \cdot Z + \delta a_j \cdot Y)$ approaches $S(b_j \cdot Z)$. (Note that by the independence of $Y$ and $Z$, $b_j \cdot Z + \delta a_j \cdot Y$ is simply a standard Gaussian regularization of $b_j \cdot Z$.) It now follows that for $\delta$ sufficiently small,

$$\sum_{j=1}^m c_j S(a_j \cdot X) - S(X) \ge D(A_J, c_J) + D(B_{J^c}, c_{J^c}) - 2\epsilon .$$

This implies that $D(A,c) \ge D(A_J, c_J) + D(B_{J^c}, c_{J^c})$. We have implicitly assumed that $D(B_{J^c}, c_{J^c}) < +\infty$ (we shall later only need this case, actually), but the argument remains valid if $D(B_{J^c}, c_{J^c}) = +\infty$. Thus (3.24) is established.

Now suppose that $D_G(B_{J^c}, c_{J^c}) = D(B_{J^c}, c_{J^c})$. Then we may further assume that the random variable $Z$ in the previous paragraph is a centered Gaussian random variable. Combining this with the independent extremal centered Gaussian random variable $Y$, provided by Lemma 3.6, we see that we may take the random variable $X$ in the previous paragraph to be a centered Gaussian. Hence, in this case, $D_G(A,c) = D(A,c)$.

It remains to prove the last statements concerning the cases of equality.

We first assume that we are given a finite entropy random variable $X$ for which (3.25) is satisfied. By making a translation, we may assume that $X$ is centered; i.e., $\mathbb{E}(X) = 0$. Furthermore, the covariance matrix is non-degenerate or else the law of $X$ would be concentrated on a proper subspace and this is inconsistent with finite entropy. Since $X$ satisfies (3.25), there must be equality in (3.33), and it must be the case that

$$\sum_{j \in J} c_j S(a_j \cdot Y) - S(Y) = D(A_J, c_J) \tag{3.36}$$

and that for each $y \in V_J$,

$$\sum_{j \in J^c} c_j S(b_j \cdot Z | Y = y) - S(Z | Y = y) = D(B_{J^c}, c_{J^c}) . \tag{3.37}$$

And since $X$ is centered, so is $Y$. Next, in addition to equality in (3.37), we must have equality in (3.33). Since the only inequality used in deriving (3.33) was (3.32), this in turn requires equality in (3.32) for each $j \in J^c$. By (3.31), this means that for $j \in J^c$,

$$S(a_j \cdot X) = \int_{V_J} S(a_j \cdot X | Y = y)\, \rho_Y(y)\, dy .$$

By the condition for equality in (3.28), this implies that for $j \in J^c$, $a_j \cdot X$ and $Y$ are independent random variables. But then for any $y \in V_J$, by independence

$$\langle y, a_j \rangle = \mathbb{E}[(y \cdot Y)(a_j \cdot X)] = \mathbb{E}(y \cdot Y)\, \mathbb{E}(a_j \cdot X) = 0 .$$

This shows that $V_J$ and $V_{J^c}$ are orthogonal subspaces in the inner product defined in terms of the covariance. Thus their dimensions sum exactly to $n$ and so (3.26) holds.
We now prove the final statement describing how extremizers split.

Note that given an invertible operator $T$ on $\mathbb{R}^n$, a random vector $X$ is extremal (3.25) for $(A,c)$ if and only if $T^{-*}X$ is extremal for $(TA,c)$, with the notation $TA = [Ta_1, \dots, Ta_m]$. Indeed, since $a_j \cdot X = Ta_j \cdot T^{-*}X$ and $S(T^{-*}X) = S(X) + \ln(|\det(T)|)$, we have that (3.25) is equivalent to

$$\sum_{j=1}^m c_j S(Ta_j \cdot T^{-*}X) - S(T^{-*}X) = D(TA,c)$$

and $D(TA,c) = D(A,c) - \ln(|\det(T)|)$.

As in the statement of the lemma, let $T$ be an invertible operator on $\mathbb{R}^n$ such that $\mathbb{R}^n = TV_J \oplus^{\perp} TV_{J^c}$. The previous remark explains the mechanism of replacing $A$ by $TA$ and $X$ by $T^{-*}X$. So after this transformation we are reduced to proving the statement in the case $T = \mathrm{Id}$. Therefore we assume from now on that

$$\mathbb{R}^n = V_J \oplus^{\perp} V_{J^c} .$$

We go back to the beginning of the proof and note that $b_j = a_j$ for all $j \in J^c$: the orthogonal projection does nothing in this case ($P_J^{\perp} = P_{J^c}$).

Assume $X$ is an extremizer (3.25) which is decomposed as before as $X = Y + Z$. Then as in the argument above we must have that

$$\sum_{j \in J} c_j S(a_j \cdot Y) - S(Y) = D(A_J, c_J) \tag{3.38}$$

and that for each $y \in V_J$,

$$\sum_{j \in J^c} c_j S(a_j \cdot Z | Y = y) - S(Z | Y = y) = D(A_{J^c}, c_{J^c}) , \tag{3.39}$$

with $Y$ and $a_j \cdot X$ independent for every $j \in J^c$. Since $a_j \cdot X = a_j \cdot Z$ for every $j \in J^c$, we have that $a_j \cdot Z$ is independent of $Y$ for $j \in J^c$ and so $S(a_j \cdot Z | Y = y) = S(a_j \cdot Z)$. Using this, we get, after integrating (3.39) with respect to $\rho_Y(y)\, dy$ and applying (3.28) with $W = Z$,

$$D(A_{J^c}, c_{J^c}) \le \sum_{j \in J^c} c_j S(a_j \cdot Z) - S(Z) .$$

By the definition of $D(A_{J^c}, c_{J^c})$ this inequality must be an equality, i.e.

$$\sum_{j \in J^c} c_j S(a_j \cdot Z) - S(Z) = D(A_{J^c}, c_{J^c}) , \tag{3.40}$$

and therefore, there must be equality in the application of (3.28) that we just made. This implies that $Z$ and $Y$ are independent, as claimed.

Conversely, let $X$ be a random vector such that $X = Y + Z$ in the decomposition $\mathbb{R}^n = V_J \oplus^{\perp} V_{J^c}$ with $Y$ and $Z$ independent and such that (3.38) and (3.40) hold. Then we have (3.39) and we readily check that there is equality at every step. So $X$ is indeed an extremizer (3.25). □

We are now ready to prove Theorem 3.1.

Proof of Theorem 3.1: By Lemma 3.6, whenever $c \in K_A^\circ$, $D_G(A,c) = D(A,c)$, and there is a Gaussian optimizer.

Hence it remains to consider the case $c \in K_A \setminus K_A^\circ$. Then taking $J$ to be a proper non-empty subset of $\{1, \dots, m\}$ of least cardinality for which there is equality in (3.8), we may "peel off" $|J|$ vectors from our set, as in the first part of Lemma 3.7, and reduce matters to the consideration of $D(B_{J^c}, c_{J^c})$. By that Lemma, $D_G(A,c) = D(A,c)$ whenever $D_G(B_{J^c}, c_{J^c}) = D(B_{J^c}, c_{J^c})$. Now, if $B_{J^c}$ and $c_{J^c}$ are such that for every proper subset of the remaining indices, strict inequality holds in the analog of (3.8), i.e. $c_{J^c} \in K_{B_{J^c}}^\circ$, then $D_G(B_{J^c}, c_{J^c}) = D(B_{J^c}, c_{J^c})$ follows from Lemma 3.6. Otherwise, we "peel off" another proper subset of indices for which equality holds in (3.8), and reduce to a problem with a strictly smaller number of vectors. In a finite number of steps, this process must end. □

Our next theorem concerns the cases of equality in the subadditivity inequality. As we have seen in Lemma 3.7, when there is equality, and no $c_j$ is zero, then either $c \in K_A^\circ$, or the variational problem can be split into two problems of the same type, but involving a reduced number of vectors, and for random variables taking values in subspaces of a reduced dimension.

Of course, each of these reduced problems must also have an optimizer, and so we can apply the same dichotomy to each of them. This leads to the following definition:

3.8 DEFINITION (Totally reducible for $c$). Let $A = \{a_1, \dots, a_m\}$ be a family of vectors that spans $\mathbb{R}^n$ and $\{c_1, \dots, c_m\}$ a set of real numbers with $0 \le c_j \le 1$ for $j \le m$. We say that $\{a_1, \dots, a_m\}$ is totally reducible for $c$ if $c \in K_A$ and if, for some $k \ge 1$, there is a decomposition (possibly with $k = 1$)

$$\{1, \dots, m\} = J_0 \cup J_1 \cup \dots \cup J_k$$

where $j \in J_0$ if and only if $c_j = 0$, and

$$\mathbb{R}^n = V_{J_1} \oplus \dots \oplus V_{J_k} \quad \text{with} \quad V_{J_i} = \mathrm{span}(\{a_\ell : \ell \in J_i\}) ,$$

such that for each $1 \le i \le k$, there is no nonempty proper subset of $J_i$ that yields equality in (3.8). Here, $J_0$ may be empty, but for $1 \le i \le k$, $J_i$ is to be non empty.

Note that, if $\{a_1, \dots, a_m\}$ is totally reducible for $c$, then we have, with the notation of the definition, that for $1 \le i \le k$,

$$c_{J_i} \in K_{A_{J_i}}^\circ .$$

The analysis made so far proves the following theorem, which gives a complete analysis of the cases of equality in the subadditivity inequality.

3.9 THEOREM. Consider a family $A = \{a_1, \dots, a_m\}$ of vectors spanning $\mathbb{R}^n$. Then for any $c \in K_A$, there exists a finite entropy random variable $X$ for which

$$\sum_{j=1}^m c_j S(a_j \cdot X) - S(X) = D(A,c) , \tag{3.41}$$

if and only if $\{a_1, \dots, a_m\}$ is totally reducible for $c$. In this case, if $\mathbb{R}^n = V_{J_1} \oplus \dots \oplus V_{J_k}$ is the corresponding decomposition of $\mathbb{R}^n$ from Definition 3.8, let $T$ be any symmetric positive operator on $\mathbb{R}^n$ such that the following orthogonal decomposition holds

$$\mathbb{R}^n = TV_{J_1} \oplus^{\perp} \dots \oplus^{\perp} TV_{J_k} . \tag{3.42}$$

Then the extremizers (3.41) are exactly the random vectors $X$ such that $T^{-*}X$ decomposes as

$$T^{-*}X = X_1 + \dots + X_k$$

where $\{X_1, \dots, X_k\}$ is an independent set of random variables with each $X_i$ taking values in $TV_{J_i}$ and extremal for the corresponding problem $([Ta_j \; ; \; j \in J_i], c_{J_i})$. More precisely, for each $i \le k$, if $\dim(V_{J_i}) = 1$, then $X_i$ can be any finite entropy random variable with values in $TV_{J_i}$; however, if $\dim(V_{J_i}) > 1$, then $X_i$ is necessarily Gaussian, and its covariance is a constant multiple of $R_i^2$, where $R_i$ is the unique positive definite linear transformation on $TV_{J_i}$ such that

$$\sum_{j \in J_i} c_j \frac{R_i Ta_j}{|R_i Ta_j|} \otimes \frac{R_i Ta_j}{|R_i Ta_j|} = \mathrm{Id}_{TV_{J_i}} \quad \text{and} \quad \mathrm{Tr}(R_i^2) = \dim(V_{J_i}) .$$

Finally, if $X$ is an extremizer (3.41) then the symmetric positive operator $T$ defined by $x \cdot T^2 y = \mathbb{E}[(x \cdot (X - \mathbb{E}X))(y \cdot (X - \mathbb{E}X))]$ satisfies the required condition (3.42).

Proof: The proof relies on successive applications of Lemmas 3.7 and 3.6. First of all, note that the vectors $a_j$ for the indices $j$ such that $c_j = 0$ play no role in the inequality, and so without loss of generality, we may discard these indices without changing $D(A,c)$, the extremizers and $K_A$. So we will assume that $c_j > 0$ for all $j \le m$ (this means $J_0 = \emptyset$ in Definition 3.8).

Assume there exists an extremizer $X$, which, after translation, can be assumed to be centered, and let $T$ be the symmetric positive operator on $\mathbb{R}^n$ defined by $Tx \cdot Ty = \mathbb{E}[(x \cdot X)(y \cdot X)]$. As explained in the proof of Lemma 3.7, the change of vectors $X \to T^{-*}X$ and $a_j \to Ta_j$ reduces the problem to the case $T = \mathrm{Id}$, which means that $X$ has unit covariance. Then from Lemma 3.7 we have $\mathbb{R}^n = V_{J_1} \oplus^{\perp} V_{J_1^c}$ for some set of indices $J_1$, with $c_{J_1} \in K_{A_{J_1}}^\circ$ and $c_{J_1^c} \in K_{A_{J_1^c}}$, and moreover in this orthogonal decomposition $X = X_1 + Z$ with $X_1$ and $Z$ extremal for $(A_{J_1}, c_{J_1})$ and $(A_{J_1^c}, c_{J_1^c})$, respectively. We then apply Lemma 3.7 on the space $V_{J_1^c}$ to the vector $Z$, which has unit covariance and is extremal. This gives for some $J_2 \subset J_1^c$ another orthogonal decomposition $V_{J_1^c} = V_{J_2} \oplus^{\perp} V_{J_2^c}$, where $J_2^c = \{j \in J_1^c \; ; \; j \notin J_2\}$, with $c_{J_2} \in K_{A_{J_2}}^\circ$. After a finite number $k$ of steps this process must end and we have

$$\mathbb{R}^n = V_{J_1} \oplus^{\perp} \dots \oplus^{\perp} V_{J_k}$$

with $c_{J_i} \in K_{A_{J_i}}^\circ$ for $i \le k$. This shows that there exists an extremizer only when $\{a_1, \dots, a_m\}$ is totally reducible for $c$. Note that we have also shown that this sum is orthogonal w.r.t. the scalar product given by the covariance of an extremizer.

We assume from now on that $\{a_1, \dots, a_m\}$ is totally reducible for $c$ and that $\mathbb{R}^n = V_{J_1} \oplus \dots \oplus V_{J_k}$ is the corresponding decomposition of $\mathbb{R}^n$ from Definition 3.8. We can assume that $|J_1| \le |J_2| \le \dots \le |J_k|$. Let $T$ be any symmetric positive operator on $\mathbb{R}^n$ such that the following orthogonal decomposition holds

$$\mathbb{R}^n = TV_{J_1} \oplus^{\perp} \dots \oplus^{\perp} TV_{J_k} .$$

Of course, there always exists such a linear map $T$. As before the change of vectors $X \to T^{-*}X$ and $a_j \to Ta_j$ reduces the problem to the case $T = \mathrm{Id}$ and

$$\mathbb{R}^n = V_{J_1} \oplus^{\perp} \dots \oplus^{\perp} V_{J_k} .$$

With this orthogonal decomposition in hand, we can use Lemma 3.7 to successively "peel off" orthogonal blocks. We first apply this Lemma to $J_1$ and $J_1^c = J_2 \cup \dots \cup J_k$, and then on the space $V_{J_1^c} = V_{J_1}^{\perp} = V_{J_2} \oplus^{\perp} \dots \oplus^{\perp} V_{J_k}$ to $J_2$, and so on. After $k$ steps we get that $D(A,c) = D(A_{J_1}, c_{J_1}) + \dots + D(A_{J_k}, c_{J_k})$ and that a random vector $X$ is an extremizer if and only if it can be written as

$$X = X_1 + \dots + X_k$$

where $X_i$ has values in $V_{J_i}$ and is extremal for $(A_{J_i}, c_{J_i})$, and with the property that

$$X_i \text{ is independent of } (X_{i+1}, \dots, X_k) , \quad \text{for } i = 1, \dots, k-1 . \tag{3.43}$$

(Note that in order to construct an extremizer $X$ we start with an extremizer $X_k$ on $V_{J_k}$ and then add an independent extremal $X_{k-1}$ on $V_{J_{k-1}}$ in order to get an extremizer on $V_{J_{k-1}} \oplus^{\perp} V_{J_k}$, and so on by repeated applications of Lemma 3.7.) Observe that the independence property (3.43) is equivalent to the independence of the set of random vectors $\{X_1, \dots, X_k\}$. Next remember that for each $i \le k$ we have $c_{J_i} \in K_{A_{J_i}}^\circ$. Thus Lemma 3.6 applies and when $\dim(V_{J_i}) > 1$ then $X_i$ is Gaussian and its covariance is imposed as stated. Recall that in dimension 1 the problem is trivial and all random variables are extremal (in particular Gaussian variables are extremal).

□

Note that the previous theorem tells us in particular that when there exist optimizers, there exist Gaussian optimizers (however this was not a needed step in our approach).

Of course, by Theorems 2.1 and 2.2, we now also know that optimizers for the classical Brascamp-Lieb inequality exist under the exact same conditions for optimality described in Theorem 3.9, and that moreover, the optimizers of the Brascamp-Lieb inequality are exactly the marginals of the optimizing probability densities for the subadditivity inequality. The full description of optimizers (in one dimensional Brascamp-Lieb inequalities) was given in [11], building on a previous characterization by Barthe [2]. In the multidimensional case, building on Barthe's work too, Bennett-Carbery-Christ-Tao [6] obtained some description, but the problem was completely solved only recently by Valdimarsson [13].



4 Consequences of the general subadditivity inequality in $\mathbb{R}^n$

There are several interesting consequences of Theorems 3.1 and 3.9. The first is a generalization of Hadamard's inequality for determinants:

4.1 THEOREM. Consider any family $A = \{a_1, \dots, a_m\}$ of $m$ vectors that span $\mathbb{R}^n$, and any set of numbers $\{c_1, \dots, c_m\}$ with $0 \le c_j \le 1$. Then with $D(A,c)$ as above, for any linear transformation $T$ from $\mathbb{R}^n$ to $\mathbb{R}^n$,

$$|\det(T)| \le e^{D(A,c)} \prod_{j=1}^m |T(a_j)|^{c_j} , \tag{4.1}$$

and this inequality is sharp in that the constant $e^{D(A,c)}$ cannot be decreased. Moreover, for $c \in K_A^\circ$, there is a transformation $T$ with $\det(T) = 1$ for which equality holds in (4.1), and, when $n \ge 2$, if we take $T$ to be positive, then $T$ is unique (up to multiplication by a positive scalar).
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Remark: In the case that m = n, and the vectors {oi, . . . ,a.m} are an orthonormal basis, and 
ci = • • • = c„ = 1, this reduces Hadamard's inequahty for determinants. In the special case 
Ylm=i '^j^j '^'j ~ I^R", SO that D{A,c) = 0, this result has been proved by Ball [Ij, with a very 
simple proof. 

For simplicity we have stated the existence of an extremal T only when c G K'^, but the right 
condition is that A is totally reducible for c, just as in Theorem 13.91 



Proof: By making a polar decomposition, we may assume without loss of generality that $T$ is
positive definite. Let $G_T$ be the centered Gaussian random variable with $E(u \cdot G_T)^2 = |T(u)|^2$ for all
vectors $u$ in $\mathbb{R}^n$. Then simply evaluating the left hand side of $\sum_{j=1}^m c_j S(a_j \cdot G_T) - S(G_T) \le D(A,c)$,
we obtain (4.1). Then Theorem 3.1 provides the rest. □
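
The evaluation step in the proof above can be spelled out as follows. This is only a sketch, not part of the argument as originally written. It assumes the sign convention $S(f) = \int f \ln f$ from the introduction, a positive definite $T$ (so that $G_T$ has covariance $T^2$), and the normalization $\sum_j c_j = n$, which homogeneity of (4.1) in $T$ forces whenever $D(A,c)$ is finite.

```latex
% Entropy (with S(f) = \int f \ln f) of a centered Gaussian with covariance \Sigma on R^n:
%   S = -\tfrac{n}{2}\ln(2\pi e) - \tfrac12 \ln\det\Sigma .
\begin{align*}
S(G_T) &= -\tfrac{n}{2}\ln(2\pi e) - \ln\det(T), \\
S(a_j\cdot G_T) &= -\tfrac12\ln(2\pi e) - \ln|T(a_j)|
   \qquad\text{since } \mathrm{Var}(a_j\cdot G_T) = |T(a_j)|^2, \\
\sum_{j=1}^m c_j S(a_j\cdot G_T) - S(G_T)
  &= \ln\det(T) - \sum_{j=1}^m c_j \ln|T(a_j)|
     - \tfrac12\Big(\sum_{j=1}^m c_j - n\Big)\ln(2\pi e).
\end{align*}
% With \sum_j c_j = n the last term vanishes; bounding the left side by D(A,c)
% and exponentiating gives (4.1).
```

For instance, with $m = n$, $a_j$ the standard basis vectors, $c_j = 1$ and $D(A,c) = 0$, the bound reads $|\det(T)| \le \prod_j |T(a_j)|$, the product of the column norms of $T$.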

Theorem 4.1 gives us one simple variational expression for $D(A,c)$, namely

\[ D(A,c) = \sup\left\{ \ln\left( \frac{\det(T)}{\prod_{j=1}^m |T a_j|^{c_j}} \right) \ :\ T \text{ positive definite} \right\} . \]

There is however a simpler variational formula for $D(A,c)$ over an even lower dimensional space,
as suggested by the fact that $e^{D(A,c)}$ is also the sharp constant in the Brascamp-Lieb inequality.
By the classical theorem of Brascamp and Lieb, $e^{D(A,c)}$ may be computed by taking the functions
$\{f_1, \dots, f_m\}$ in the Brascamp-Lieb inequality to be centered Gaussians; i.e.,

\[ \{f_1(t), \dots, f_m(t)\} = \{e^{-(s_1 t)^2}, \dots, e^{-(s_m t)^2}\} , \]

and varying the $m$ numbers $s_1, \dots, s_m$. This leads directly to the variational expression (4.2) for
$D(A,c)$. Let us recall that the existence of optimizers for this problem was proved by Brascamp
and Lieb [9] under the hypothesis that every set of $n$ vectors chosen from $\{a_1, \dots, a_m\}$ is linearly
independent, and later proved by Barthe [2] for $c \in K_A^\circ$. The next theorem gives the complete
result. Although the variational formula (4.2) can be deduced by duality, we give a direct proof of
it starting from the subadditivity inequality.

4.2 THEOREM. Consider any set $\{a_1, \dots, a_m\}$ of $m$ vectors that span $\mathbb{R}^n$, $n \ge 2$. Let
$\{c_1, \dots, c_m\}$ be any set of numbers with $0 \le c_j \le 1$ verifying (3.7). Let $T$ denote the $m \times m$
diagonal matrix whose $j$th diagonal entry is $t_j$, and define the function $\Phi_A(t_1, \dots, t_m)$ by

\[ \Phi_A(t_1, \dots, t_m) = \ln\det(A e^{T} A^*) . \]

This is a convex function on $\mathbb{R}^m$, and

\[ D(A,c) + \frac12 \sum_{j=1}^m c_j \ln(c_j) = \frac12 \sup_{\{t_1, \dots, t_m\}} \left\{ \sum_{j=1}^m c_j t_j - \Phi_A(t_1, \dots, t_m) \right\} . \tag{4.2} \]

The supremum in (4.2) is attained if and only if $\{a_1, \dots, a_m\}$ is totally reducible for $c$. Moreover,

\[ \Phi_A(t_1, \dots, t_m) = \sup_{c} \left\{ \sum_{j=1}^m c_j t_j - 2\left( D(A,c) + \frac12 \sum_{j=1}^m c_j \ln(c_j) \right) \right\} . \tag{4.3} \]

Proof: For an $m \times m$ diagonal matrix $S$ with positive entries $s_j$, introduce $R_S := ((AS)(AS)^*)^{-1/2}$
as in the proof of Lemma 3.5. Let $G$ be a standard Gaussian random vector on $\mathbb{R}^n$ (i.e.,
$G \sim \mathcal{N}(0, \mathrm{Id})$), and set $G_S = R_S G$. Then

\[ \sum_{j=1}^m c_j S(a_j \cdot G_S) - S(G_S) = -\ln(\det(R_S^{-1})) - \frac12 \sum_{j=1}^m c_j \ln(|R_S a_j|^2) . \]

However,

\[ |R_S a_j|^2 = s_j^{-2} |R_S(s_j a_j)|^2 = s_j^{-2} |R_S (AS) e_j|^2 = s_j^{-2}\, e_j \cdot (AS)^* ((AS)(AS)^*)^{-1} (AS) e_j , \]

where $e_j$ is the $j$th standard basis vector in $\mathbb{R}^m$. Recall that $e_j \cdot (AS)^* ((AS)(AS)^*)^{-1} (AS) e_j$ is
the $j$th diagonal entry of the orthogonal projection in $\mathbb{R}^m$ onto the image of $(AS)^*$. Since this
orthogonal projection has rank $n$, its trace is $n$. Therefore, if we define $c_j(S) = |R_S (AS) e_j|^2$, we
have

\[ \sum_{j=1}^m c_j(S) = n \]

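for all $S$. Thus, by Jensen's inequality,

\[ \sum_{j=1}^m c_j \ln(c_j) \ge \sum_{j=1}^m c_j \ln(c_j(S)) , \]

with equality exactly when $c_j(S) = c_j$ for all $j$. Therefore, for all $S$,

\[ D(A,c) \ge \sum_{j=1}^m c_j S(a_j \cdot G_S) - S(G_S) \ge -\ln(\det(R_S^{-1})) + \frac12 \sum_{j=1}^m c_j \ln(s_j^2) - \frac12 \sum_{j=1}^m c_j \ln(c_j) , \]

so that

\[ D(A,c) + \frac12 \sum_{j=1}^m c_j \ln(c_j) \ge \frac12 \left( \sum_{j=1}^m c_j \ln(s_j^2) - \ln\det(R_S^{-2}) \right) . \]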
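
The two facts used in this chain, the trace identity $\sum_j c_j(S) = n$ and the Jensen step, can be checked directly. The following is only a sketch. It assumes that $A$ denotes the $n \times m$ matrix with columns $a_j$, that $\sum_j c_j = n$, and that the sums in the Jensen step run over the indices with $c_j > 0$; the abbreviation $B = AS$ is introduced here purely for brevity.

```latex
% Trace identity, with B = AS (an n x m matrix of rank n):
\begin{align*}
\sum_{j=1}^m c_j(S)
  = \operatorname{tr}\big(B^*(BB^*)^{-1}B\big)
  = \operatorname{tr}\big((BB^*)^{-1}BB^*\big)
  = \operatorname{tr}(\mathrm{Id}_{\mathbb{R}^n}) = n .
\end{align*}
% Jensen step: the weights c_j/n form a probability vector and ln is concave, so
\begin{align*}
\sum_{j} c_j \ln\frac{c_j(S)}{c_j}
  = n\sum_{j}\frac{c_j}{n}\,\ln\frac{c_j(S)}{c_j}
  \le n\ln\Big(\sum_{j}\frac{c_j}{n}\cdot\frac{c_j(S)}{c_j}\Big)
  \le n\ln\Big(\frac1n\sum_{j=1}^m c_j(S)\Big) = 0 ,
\end{align*}
% that is, \sum_j c_j \ln c_j(S) \le \sum_j c_j \ln c_j,
% with equality exactly when c_j(S) = c_j for all j.
```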

Moreover, as we see from the proof of Lemma 3.5 (based on an observation by Barthe) and
Lemma 3.6 and the remarks made just above, there is equality when $c \in K_A^\circ$ and $S = S_0$ is
the choice of $S$ (unique up to a multiple) for which (3.20) is true. Let $T$ denote the $m \times m$ diagonal
matrix whose $j$th diagonal entry is $t_j = \ln s_j^2$. Then $\ln(\det(R_S^{-2})) = \ln(\det(A e^{T} A^*))$ and therefore,
if we define the function $\Phi_A$ by

\[ \Phi_A(t_1, \dots, t_m) = \ln\det(A e^{T} A^*) , \]

we have, for every $t_1, \dots, t_m \in \mathbb{R}$,

\[ 2\left( D(A,c) + \frac12 \sum_{j=1}^m c_j \ln(c_j) \right) + \Phi_A(t_1, \dots, t_m) \ge \sum_{j=1}^m c_j t_j \tag{4.4} \]

with equality, when $c \in K_A^\circ$, for some choice of $t_j$'s. The function $c \mapsto 2D(A,c) + \sum_{j=1}^m c_j \ln(c_j)$
is convex (because, as mentioned at the beginning of the previous section, the function $c \mapsto D(A,c)$
is convex by definition), and its domain (i.e. where it is $< +\infty$) is $K_A$. Therefore we get that

\[ \Phi_A(t_1, \dots, t_m) = \sup_{c \in K_A^\circ} \left\{ \sum_{j=1}^m c_j t_j - 2\left( D(A,c) + \frac12 \sum_{j=1}^m c_j \ln(c_j) \right) \right\} = \sup_{c \in K_A} \{ \dots \} = \sup_{c \in \mathbb{R}^m} \{ \dots \} . \]

This shows that $\Phi_A$ is convex on $\mathbb{R}^m$ and that it is the Legendre transform of the convex function

\[ c \mapsto 2D(A,c) + \sum_{j=1}^m c_j \ln(c_j) . \]

Moreover, for given $A$ and $c$, equality in (4.4) for some $t_1, \dots, t_m$ means that for the
corresponding values $s_1, \dots, s_m$, the Gaussian $G_S$ is an extremizer for the variational problem defining
$D(A,c)$. By Theorem 3.9 this means that $\{a_1, \dots, a_m\}$ is totally reducible for $c$.

Conversely, if $\{a_1, \dots, a_m\}$ is totally reducible for $c$, then the variational problem in (4.2) splits
into a sum of independent and orthogonal (after a suitable linear transformation $T$) such problems,
but of the interior type (i.e. $c \in K_A^\circ$) for which Barthe showed optimizers to exist. Equivalently,
Theorem 4.3 below ensures that we can find a positive operator $R$ for which the decomposition
of the identity (3.19) holds. Then, as mentioned in the remark after the proof of Lemma 3.6, the
random vector $RG$ is extremal for $D(A,c)$, and setting $s_j^2 = c_j/|R a_j|^2$ we have that $R = R_S$ and
$c_j(S) = c_j$ by construction (see the proof of Lemma 3.5). This guarantees equality at all steps of
our computation above and thus ensures equality in (4.4). □

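Remark: We have proved that

\[ D(A,c) + \frac12 \sum_{j=1}^m c_j \ln(c_j) = \frac12 \Phi_A^*(c) , \]

where $\Phi_A^*$ denotes the Legendre transform of $\Phi_A$. Since $\nabla\Phi_A^*(\nabla\Phi_A(0)) = 0$, the choice $c = \nabla\Phi_A(0)$
minimizes $\Phi_A^*(c)$, and hence $D(A,c) + \frac12 \sum_{j=1}^m c_j \ln(c_j)$. There is a misprint in [11] in which it is
stated (in slightly different notation) that this choice of $c$ minimizes $D(A,c)$ itself.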
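
The minimizing choice $c = \nabla\Phi_A(0)$ in this remark can be made explicit. The computation below is only a sketch. It assumes, as in the proof of Theorem 4.2, that $A$ is the $n \times m$ matrix with columns $a_j$, so that $A e^{T} A^* = \sum_k e^{t_k} a_k \otimes a_k$, and it uses only the standard identity $\partial \ln\det M = \operatorname{tr}(M^{-1} \partial M)$.

```latex
\begin{align*}
\frac{\partial \Phi_A}{\partial t_j}(t_1,\dots,t_m)
  &= \operatorname{tr}\Big( (A e^{T} A^*)^{-1}\, e^{t_j}\, a_j\otimes a_j \Big)
   = e^{t_j}\, a_j\cdot (A e^{T} A^*)^{-1} a_j , \\
\big(\nabla\Phi_A(0)\big)_j &= a_j\cdot (AA^*)^{-1} a_j , \\
\sum_{j=1}^m \big(\nabla\Phi_A(0)\big)_j
  &= \operatorname{tr}\big((AA^*)^{-1}AA^*\big) = n .
\end{align*}
```

In the notation of the proof of Theorem 4.2, this is $c_j(S)$ evaluated at $S = \mathrm{Id}$, so the minimizing $c$ automatically satisfies $\sum_j c_j = n$.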

We finally return to Lemma 3.5, as we are now in a position to give necessary and sufficient
conditions for the existence of the change of variables provided there.

Let $A = \{a_1, \dots, a_m\}$ be a family of $m$ vectors spanning $\mathbb{R}^n$, and let $c$ be any vector in $\mathbb{R}^m$ with
$0 \le c_j \le 1$ for all $j$. Theorem 3.9 gives us necessary and sufficient conditions for the existence of an
extremal $X$ for the subadditivity inequality. By Theorem 2.2 these conditions are also necessary
and sufficient for the existence of extremals for the Brascamp-Lieb inequality. Moreover, we see
that extremals for the latter exist if and only if centered Gaussian extremals exist.

From here, it is easy to prove the following theorem, which supersedes Lemma 3.5 and gives
necessary and sufficient conditions for the existence of the change of variables considered there. This
result was obtained (in the more general multidimensional setting) by Bennett-Carbery-Christ-Tao [6]
in the course of their study of the Brascamp-Lieb extremizers; here we use the extremizers of the
subadditivity of entropy inequality. Though this theorem concerns a problem in linear algebra, we
do not know a direct proof of it in a purely linear algebra context, though there may be one.

4.3 THEOREM. Let $\{a_1, \dots, a_m\}$ be any collection of $m$ vectors that span $\mathbb{R}^n$ for $n \ge 2$. Let
$\{c_1, \dots, c_m\}$ be any $m$ numbers satisfying $0 \le c_j \le 1$ for each $j$. Then there exists an invertible
symmetric $n \times n$ matrix $R$ so that

\[ \sum_{j=1}^m c_j\, \frac{R a_j}{|R a_j|} \otimes \frac{R a_j}{|R a_j|} = \mathrm{Id}_{\mathbb{R}^n} \]

if and only if the set $\{a_1, \dots, a_m\}$ is totally reducible for $c$.






Proof: The proof of Lemma 3.6 shows that whenever such a matrix $R$ exists, there exists an
optimizer for the subadditivity inequality. Thus, by Theorem 3.9, the condition that $\{a_1, \dots, a_m\}$
is totally reducible for $c$ is necessary.

Conversely, assume that $\{a_1, \dots, a_m\}$ is totally reducible for $c$ and that $\mathbb{R}^n = V_{J_1} \oplus \dots \oplus V_{J_k}$ is
the corresponding decomposition of $\mathbb{R}^n$ from Definition 3.8. We can then find an invertible operator
$T$ on $\mathbb{R}^n$ such that the following orthogonal decomposition holds:

\[ \mathbb{R}^n = T V_{J_1} \oplus \dots \oplus T V_{J_k} . \]

Since we have $c_{J_i} \in K^\circ_{A_{J_i}} = K^\circ_{T A_{J_i}}$ for $i \le k$ (with $T A_{J_i} = [T a_j,\ j \in J_i]$), we may use Lemma 3.5
on each of the reduced orthogonal subspaces $T V_{J_i}$; this gives us some symmetric invertible operator
$R_i$ on $T V_{J_i}$, and putting all these operators together we get a symmetric invertible operator $R$ on
$\mathbb{R}^n$ such that

\[ \sum_{j=1}^m c_j\, \frac{R T a_j}{|R T a_j|} \otimes \frac{R T a_j}{|R T a_j|} = \mathrm{Id}_{\mathbb{R}^n} . \]

Then the positive symmetric operator $\widetilde{R} = \sqrt{T^* R^2 T}$ satisfies the desired property.

□



5 A convolution inequality for eigenvalues 

We investigate here the dual of the superadditivity of Fisher information inequality (3.16) from
Proposition 3.3.

In Section 2 we have shown that the Legendre transform of the entropy provides an equivalence
between subadditivity of the entropy and Brascamp-Lieb inequalities. It turns out that the Fisher
information is also a convex functional and its Legendre transform is known to be the smallest
eigenvalue of a Schrödinger operator. (This is used extensively in the theory of large deviations, for
example.) We shall use this fact to derive a subadditivity property of the smallest eigenvalues of
Schrödinger operators.

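For any continuous bounded function $V$ on $\mathbb{R}^n$, define

\[ \lambda(V) = \sup\left\{ \int_{\mathbb{R}^n} V(x)\phi^2(x)\,dx - 4\int_{\mathbb{R}^n} |\nabla\phi(x)|^2\,dx \ :\ \int_{\mathbb{R}^n} \phi^2(x)\,dx = 1 \right\} . \tag{5.1} \]

Then $-\lambda(V)$ is the "ground state" eigenvalue of

\[ -4\Delta - V , \]

provided the bottom of the spectrum is an eigenvalue, and in any case, it is the bottom of the
spectrum.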
Then since

\[ I(f) = \int_{\mathbb{R}^n} \frac{|\nabla f|^2}{f}\,dx = 4\int_{\mathbb{R}^n} |\nabla\sqrt{f}|^2\,dx , \]

we can rewrite (5.1) as

\[ \lambda(V) = \sup\left\{ \int_{\mathbb{R}^n} V(x)f(x)\,dx - I(f) \right\} , \]

where the supremum is taken over all probability densities $f$. This gives us the analog of (2.4) for
Fisher information:

\[ \int_{\mathbb{R}^n} V(x)f(x)\,dx \le \lambda(V) + I(f) , \tag{5.2} \]

with equality if and only if $f = \phi^2$ where $(-4\Delta - V)\phi = -\lambda(V)\phi$. (Here, by the definition (5.1) of
$\lambda(V)$, $\phi$ is the "ground state" eigenfunction.)
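
The identity for $I(f)$ and the substitution $f = \phi^2$ rest on a one-line chain-rule computation, sketched here for a smooth, strictly positive density $f$ (a simplifying assumption made only for this illustration).

```latex
\begin{align*}
\nabla\sqrt{f} = \frac{\nabla f}{2\sqrt{f}}
\quad\Longrightarrow\quad
4\,|\nabla\sqrt{f}|^2 = \frac{|\nabla f|^2}{f}
\quad\Longrightarrow\quad
I(f) = \int\frac{|\nabla f|^2}{f}\,dx = 4\int|\nabla\sqrt{f}|^2\,dx .
\end{align*}
% With \phi = \sqrt{f}, the constraint \int \phi^2 dx = 1 says that f is a probability
% density, and
%   \int V \phi^2 dx - 4 \int |\nabla\phi|^2 dx = \int V f dx - I(f).
```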

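Now let $V_1, \dots, V_n$ be continuous functions on $\mathbb{R}$, and define

\[ V(x) = \sum_{j=1}^n V_j(u_j \cdot x) , \]

where $\{u_1, \dots, u_n\}$ is any orthonormal basis for $\mathbb{R}^n$. Then

\[ -4\Delta - V = \sum_{j=1}^n \left( -4(u_j \cdot \nabla)^2 - V_j(u_j \cdot x) \right) , \]

so that, by separation of variables,

\[ \lambda(V) = \sum_{j=1}^n \lambda(V_j) . \]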
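
One half of this separation-of-variables identity can be checked by hand with product trial functions. The sketch below assumes that each $\phi_j$ is a normalized trial function on $\mathbb{R}$ (names introduced here for illustration only) and uses the orthonormality of $\{u_1, \dots, u_n\}$ to change variables to $y_j = u_j \cdot x$.

```latex
% Product trial function: \phi(x) = \prod_{j=1}^n \phi_j(u_j\cdot x), with \int_R \phi_j^2 dt = 1.
% Orthonormality gives |\nabla\phi|^2 = \sum_j (u_j\cdot\nabla\phi)^2 and dx = dy_1 \cdots dy_n.
\begin{align*}
\int_{\mathbb{R}^n}\phi^2\,dx
  &= \prod_{j=1}^n\int_{\mathbb{R}}\phi_j^2(y_j)\,dy_j = 1, \\
\int_{\mathbb{R}^n} V\phi^2\,dx - 4\int_{\mathbb{R}^n}|\nabla\phi|^2\,dx
  &= \sum_{j=1}^n\Big(\int_{\mathbb{R}} V_j(t)\,\phi_j^2(t)\,dt
       - 4\int_{\mathbb{R}}|\phi_j'(t)|^2\,dt\Big).
\end{align*}
% Taking the supremum over each \phi_j separately gives \lambda(V) \ge \sum_j \lambda(V_j).
```

For bounded $V_j$, the reverse inequality is the case $m = n$, $c_j = 1$ of (5.3).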

The following result generalizes this to the case in which we have $m$ unit vectors $\{u_1, \dots, u_m\}$
satisfying (3.13):

5.1 THEOREM. Let $\{u_1, \dots, u_m\}$ be any $m$ unit vectors in $\mathbb{R}^n$ such that there are positive
numbers $c_1, \dots, c_m$ satisfying

\[ \sum_{j=1}^m c_j\, u_j \otimes u_j = \mathrm{Id}_{\mathbb{R}^n} . \]

For any $m$ continuous bounded functions $V_1, \dots, V_m$ on $\mathbb{R}$, define on $\mathbb{R}^n$

\[ V(x) = \sum_{j=1}^m V_j(u_j \cdot x) . \]

Then

\[ \lambda(V) \le \sum_{j=1}^m c_j\, \lambda\!\left( \frac{1}{c_j} V_j \right) . \tag{5.3} \]



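Proof: Choose an $\epsilon > 0$ and a probability density $f = \phi^2$ such that

\[ \int_{\mathbb{R}^n} V(x)\phi^2(x)\,dx - 4\int_{\mathbb{R}^n} |\nabla\phi(x)|^2\,dx \ge \lambda(V) - \epsilon . \]

Then using (3.16),

\begin{align*}
\lambda(V) - \epsilon &\le \sum_{j=1}^m \int_{\mathbb{R}} V_j(t)\, f_{(u_j)}(t)\,dt - I(f) \\
&\le \sum_{j=1}^m \int_{\mathbb{R}} V_j(t)\, f_{(u_j)}(t)\,dt - \sum_{j=1}^m c_j\, I(f_{(u_j)}) \\
&\le \sum_{j=1}^m c_j\, \lambda\!\left( \frac{1}{c_j} V_j \right) ,
\end{align*}

where the last step applies (5.2) in dimension one to each $V_j/c_j$ and the marginal $f_{(u_j)}$.
Since $\epsilon > 0$ is arbitrary, this proves the result.

□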



The inequality (5.3) is sharp since one can use another Legendre transform, as in the proof of
Theorem 2.1, and see that it implies the sharp inequality (3.16). Inequality (5.3) could also be
proved using a semigroup (or stochastic) method inspired by the one used by Borell [8] in his
study of Brunn-Minkowski type inequalities (which, in a sense, are the converse of the inequalities
considered here); this would be more complicated than starting from the inequality (3.16) for the
Fisher information, though.

An analogous result for functions on the sphere could be given using the sharp superadditivity
of Fisher information inequality proved in [3].